Which Programming Language Should Be Used for Data Science and AI
At our last Kiel.AI Meetup, we discussed the differences between the programming languages typically used in data science and AI.
We had three talks, each presenting one programming language and its specific characteristics.
We started with a talk from Matthias Nannt, who presented Python and emphasized the advantages it derives from its roots in computer science and the tech industry. These are also the roots of today's machine learning algorithms and, of course, of production software environments in general. A particular reason for its popularity among programmers and tech companies is its clear syntax and readability: code blocks are delimited by indentation rather than by the usual curly braces.
For production environments and machine learning algorithms, Python is currently by far the most popular programming language.
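To illustrate the point about readability, here is a minimal sketch of how Python uses indentation to mark where a loop or a condition begins and ends. It is not taken from the talk; the function name `describe` and the sample numbers are made up for illustration.

```python
def describe(values):
    """Return a short label for each number in values."""
    labels = []
    for value in values:
        # The indented lines below belong to the for loop.
        if value < 0:
            labels.append("negative")
        elif value == 0:
            labels.append("zero")
        else:
            labels.append("positive")
    # Back at the function's indentation level: the loop has ended.
    return labels


print(describe([-2, 0, 5]))
```

By convention (PEP 8), each level is indented with four spaces. Parentheses are still used for function calls; what Python drops are the braces that other languages use to group statements.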
The second talk was by Jonas Mielck, who presented Stata, a software package that is very popular in economics, particularly because it also provides a convenient graphical user interface for conducting statistical analyses. It is comparable to SPSS, which has similar characteristics and a similar standing in the social sciences. However, in contrast to Python and R, these packages are not free and provide only very restricted functionality, which is why Jonas himself was quite unsure about the long-term prospects of Stata given its strong limitations. Nevertheless, software packages like Stata and SPSS (with their corresponding scripting languages) are still the most used programs for statistical analyses in higher education and are still taught to new generations of students.
The last talk was from Steffen Brandt on R, which in contrast to Python has its roots in statistics and is optimized for the needs of statisticians. In academia it is slowly replacing statistical software such as Stata and SPSS. For statistical methods, R provides the most comprehensive set of function libraries, going beyond what is available in Python. This is also the reason why Python is not an option for many academic disciplines that rely on statistical methods that do not exist in Python (e.g., medicine, biology, education, psychology, economics, and others). As for the programming itself, R is much more individual and gives programmers a lot of freedom in their particular "programming style". That might be appealing if you work by yourself, but it becomes a problem when you program in a larger team and everyone needs to be able to read everyone else's code.
Here you can also download the presentation slides for Python, Stata, and R.