Its advantage is that it is simple and lets you test simple hypotheses quickly, for example whether a given set of columns is unique. But in general it is an ineffective tool for this class of tasks.

Pivot table MS Excel

Excel Pivot Tables are already a much more powerful tool. They are convenient for testing interesting hypotheses of medium complexity, for example whether there is a functional dependency between two sets of columns. But all of this is done by hand, so you can't test many hypotheses.
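For comparison, the two hypotheses mentioned above (is a set of columns unique, and does one set of columns functionally determine another) can each be checked in a few lines of pandas. This is only a minimal sketch: the helper names `is_unique` and `holds_fd` and the toy table are invented for illustration and are not part of any library discussed in this article.

```python
from typing import List

import pandas as pd


def is_unique(df: pd.DataFrame, cols: List[str]) -> bool:
    """Return True if no two rows share the same values in `cols`."""
    return not df.duplicated(subset=cols).any()


def holds_fd(df: pd.DataFrame, lhs: List[str], rhs: List[str]) -> bool:
    """Return True if the functional dependency lhs -> rhs holds in `df`."""
    # lhs -> rhs holds exactly when adding the rhs columns to lhs
    # does not create any new distinct combinations of values.
    n_lhs = len(df.drop_duplicates(subset=list(lhs)))
    n_both = len(df.drop_duplicates(subset=list(lhs) + list(rhs)))
    return n_lhs == n_both


# Toy data, invented for illustration only.
df = pd.DataFrame({
    "track_id": [1, 1, 2, 3],
    "genre": ["rock", "pop", "rock", "jazz"],
    "artist": ["A", "A", "B", "C"],
})

print(is_unique(df, ["track_id"]))             # False: track 1 appears twice
print(is_unique(df, ["track_id", "genre"]))    # True
print(holds_fd(df, ["track_id"], ["artist"]))  # True: each track has one artist
print(holds_fd(df, ["track_id"], ["genre"]))   # False: track 1 has two genres
```

Counting distinct combinations keeps the check short, but it only confirms or rejects one hypothesis at a time, so testing many of them still needs a loop over column sets.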

Python

In complex cases Python is required. Interestingly, I could only find one library that does this kind of architectural analysis out of the box (git, article). Dtool is written in Python. It is clear that the library is no longer supported by its creators, which is frustrating. I decided to reinvent the wheel and write my own version, which would make architectural analysis more convenient. The code is posted on GitHub. Below, in the case study, I will show the use of the tool on a real table.

git clone h
from s import

Case study of table analysis from

As a table I took a dataset of music tracks from e (link). It's already in a nice flat form, so we can start analyzing right away, bypassing the boring ETL stage.

The project is available as a Colab with code here. It contains the table analysis using my Python class Fs. Below I will describe the main findings. As an experiment, you can look at the table header yourself and think about how you would do a review analysis.

Table header mat

Primary key: what does the tool give us?

The first question is what the primary key is. No single column ensures that all rows are unique, while the combination of two columns e does ensure uniqueness. In general this is strange: one would think that track_id would be the primary key. But about thousand tracks have or more genres.
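To make the idea of a primary key search concrete, here is a minimal brute-force sketch in pandas. It is not the implementation used by the tool in this article. The function name `candidate_keys`, the toy table, and every column name except `track_id` are assumptions made for illustration. The sketch tries column combinations in order of increasing size and keeps the minimal ones whose values are unique across all rows.

```python
from itertools import combinations
from typing import List, Tuple

import pandas as pd


def candidate_keys(df: pd.DataFrame, max_size: int = 2) -> List[Tuple[str, ...]]:
    """Return minimal column combinations (up to `max_size`) that are unique per row."""
    found: List[Tuple[str, ...]] = []
    for size in range(1, max_size + 1):
        for cols in combinations(df.columns, size):
            # Skip supersets of a key already found: unique, but not minimal.
            if any(set(key) <= set(cols) for key in found):
                continue
            if not df.duplicated(subset=list(cols)).any():
                found.append(cols)
    return found


# Toy data, invented for illustration only: one track listed under two genres.
tracks = pd.DataFrame({
    "track_id": [1, 1, 2, 3],
    "genre": ["rock", "pop", "rock", "jazz"],
    "title": ["x", "x", "x", "y"],
})

print(candidate_keys(tracks))  # [('track_id', 'genre')]
```

The number of combinations grows quickly with the number of columns, so `max_size` is kept small here. On a wide table a smarter search that prunes candidates would be needed.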

