Europe DBpedia 2017

I was kind of bored while doing some work, and just last week a colleague and I were discussing my vague-places project.

The project had been forgotten for a while, but today I blew the dust off, recovered it, and updated the Europe DBpedia map.

Image overview

Most of you won’t be interested in the full story, so here is a set of results. If something piques your interest (like why Portugal has almost no points), just keep reading 😀

Fixes and improvements

Coming back to my project, I had to fix a couple of issues (check the latest commits). It was mostly about incorrect class checking and a change on DBpedia’s side: in the past I used the dbpedia-owl prefix, which is now called dbo.
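To make the prefix change concrete: both names point at the same ontology namespace, http://dbpedia.org/ontology/, so an old query only needs its prefix renamed. The snippet below is a minimal sketch of that rename, not the actual fix in the repository; migrate_prefix and the sample query are made up for illustration.

```python
import re

# The ontology namespace itself did not change, only the conventional prefix name.
DBO_NAMESPACE = "http://dbpedia.org/ontology/"


def migrate_prefix(query: str) -> str:
    """Rename the old 'dbpedia-owl:' prefix to 'dbo:' in a SPARQL query string.

    Hypothetical helper, for illustration only.
    """
    # \b avoids touching a longer identifier that merely ends in the old name.
    return re.sub(r"\bdbpedia-owl:", "dbo:", query)


# Made-up query written with the old prefix.
old_query = (
    "PREFIX dbpedia-owl: <" + DBO_NAMESPACE + ">\n"
    "SELECT ?place WHERE { ?place a dbpedia-owl:Place . }"
)

new_query = migrate_prefix(old_query)
print(new_query)
```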

That was all it took to make it work, but I also added an improvement. The Python script was designed as a kind of interactive application, so it prints all its feedback to stdout. A new --reportFile option lets us dump the results to a report file instead of just printing them.
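An option like this could be wired up with argparse roughly as follows. This is a sketch under assumptions, not a copy of vagueplaces.py: only the --reportFile name comes from the script, while build_parser and write_report are hypothetical helpers.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="report output sketch")
    # A default of None means "keep printing to stdout".
    parser.add_argument("--reportFile", default=None,
                        help="write the report to this file instead of stdout")
    return parser


def write_report(lines, report_file=None):
    """Write the report lines to report_file if given, otherwise print them."""
    if report_file is None:
        for line in lines:
            print(line)
        return
    with open(report_file, "w", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")


# Parse a sample command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--reportFile", "alpha1_report.txt"])
print(args.reportFile)
```

Calling write_report(lines, args.reportFile) then covers both cases: with the option the lines end up in the file, without it they go to the terminal as before.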

Cmaking alpha shaper

Alpha shaper is a small application generating the alpha shape of the resulting points. It uses CGAL, and I’m happy to see there’s a CMakeLists.txt file. I only had to reinstall some dependencies (CGAL library) and it was good to go. It’s not very difficult to build.

Running

I want to retrieve as many European points as possible, so the idea is to OR together several common place-related terms. I’m using: city, town, village, capital, coastal. This differs from retrieving the full dataset and will of course return fewer points.

python vagueplaces.py --query city town village capital coastal --live --alpha 1 data.csv --reportFile alpha1_report.txt

I ran this command with different alphas: 2, 5 and 10.
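Repeating the run for several alphas is easy to script. The sketch below only builds the command lines, reusing the flags from the command above; the report file names follow the alpha1_report.txt pattern, and build_command is a hypothetical helper. Each list could be handed to subprocess.run to actually execute it.

```python
import shlex

KEYWORDS = ["city", "town", "village", "capital", "coastal"]


def build_command(alpha):
    """Return the argument list for one run of vagueplaces.py."""
    return (
        ["python", "vagueplaces.py", "--query"]
        + KEYWORDS
        + ["--live", "--alpha", str(alpha), "data.csv",
           "--reportFile", f"alpha{alpha}_report.txt"]
    )


commands = [build_command(alpha) for alpha in (1, 2, 5, 10)]
for command in commands:
    # shlex.join quotes the arguments so each line can be pasted into a shell.
    print(shlex.join(command))
```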

Additionally, the program seems to recommend an alpha of ~1900, so we’ll try that too.

Obtained Dataset

First, let’s have a look at the obtained Europe dataset.

Dbpedia Europe overview

At a quick glance we see that there’s a lot of density in Europe as a whole. Eastern countries are even denser than countries like Portugal or France. Overall, it seems like a good dataset to work with.

Europe dbpedia results

There are some outliers in this dataset, as the whole-world map shows. The most jarring one is the long trail of points in Russia. The others are colonies (I assume) of France, the UK, the Netherlands, etc.

Alpha Shaper

So far I’m happy with the result, and already surprised that this thing still works :-D. Now it’s time to check the results from the alpha shaper, which is built on CGAL. In my previous post about Europe I only talked about the points, but now we’ll also check the polygons.

A picture is worth a thousand words.

Europe Shape Comparison.

The results are not heartbreaking, but they are not what I expected either. Alphas of 10 and 5 roughly approximate the outline of Europe. I’m not showing the result of the alpha 1900 run, as it is almost a convex hull.

Alpha 2 gets closer to what Europe looks like, but landmasses like the UK are still part of the same polygon. I can live with that, since a smaller alpha should take care of it. Worse, there are some artifacts that should not be there. Just look at the alpha 2 and alpha 10 representations of the points in Russia: those points should belong to the same polygon, but instead we get separate islands.

Seeing these results, I reduced the alpha to 1, expecting a sharper polygon. But this is what I got:

Europe Alpha Shape 1

I did not dive into why alpha 1 gives me these results. It’s something I have in mind as PENDING. I understand that lower alphas can produce more islands, but this result points to something wrong in my approach.
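For intuition on why a smaller alpha produces more islands, here is a brute-force alpha complex in plain Python. It is only a sketch, not the CGAL code the alpha shaper uses: it keeps the Delaunay triangles whose squared circumradius is at most alpha and counts the connected pieces that remain. Comparing alpha against a squared radius is an assumption modelled on CGAL’s 2D alpha shapes, the point set is made up, and isolated edges or points that a full alpha complex would keep are ignored.

```python
from itertools import combinations


def circumcircle(a, b, c):
    """Return (center_x, center_y, squared_radius), or None if collinear."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2


def alpha_triangles(points, alpha):
    """Delaunay triangles (index triples) with squared circumradius <= alpha."""
    kept = []
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        ux, uy, r2 = circle
        # Delaunay test: no other point lies strictly inside the circumcircle.
        empty = all(
            (px - ux) ** 2 + (py - uy) ** 2 >= r2 - 1e-9
            for n, (px, py) in enumerate(points) if n not in (i, j, k)
        )
        if empty and r2 <= alpha:
            kept.append((i, j, k))
    return kept


def count_islands(triangles):
    """Count groups of triangles connected through shared vertices."""
    parent = {}

    def find(x):
        while parent.setdefault(x, x) != x:
            x = parent[x]
        return x

    for i, j, k in triangles:
        root = find(i)
        parent[find(j)] = root
        parent[find(k)] = root
    return len({find(v) for v in list(parent)})


# Two made-up clusters of four points each, about ten units apart.
cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.2, 1.1)]
points = cluster + [(x + 10.0, y) for x, y in cluster]

for alpha in (1.0, 25.0):
    print(alpha, count_islands(alpha_triangles(points, alpha)))
```

With the small alpha only the tight triangles inside each cluster survive, so the clusters stay separate; with the larger one a long triangle bridging the gap is kept and they merge. The brute-force triangulation is only suitable for a handful of points.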

Other Goodies

Since I have the dataset here, I wanted to make a map similar to the one I published in 2012, for comparison. I did not try to reproduce the same map (in color scheme and style), but it presents a similar dataset.

The main difference between the maps is how the data was obtained. The 2012 map shows all Wikipedia points with x,y data and yago:EuropeanCountries, while the 2017 one is generated with a sparser query on the abstract of each point. This leaves the new dataset with fewer points than the one used in 2012.

Europe DBpedia 2017

DBpedia Europe Points 2012

Conclusions

I had a little fun doing this, and my mind started brewing some improvements and a more “live” view. But I’m always thinking of new stuff to do, and most of it never gets done 😛

References

Vague places GitHub repository
Vague places original report

Filed under curious, gis, Maps
