I’ve been teaching Python for a while. Whilst it is one of the easier programming languages, it still takes time to learn if you’ve never coded before. Whichever way you try to learn, whether through books or lectures, you’ll need to spend even more time practicing your Python coding in order to get to grips with the language.
Even if you have coded before, there are some differences from other languages. Admittedly though, the fundamentals of most programming languages and the way you’d think about implementing algorithms are similar. Here I’ve written down some common mistakes and problems that folks tend to face when learning Python.
Python is case sensitive
Most programming languages are case sensitive, including Python. Hence, for example, a variable x is different to X, and the same goes for the names of imported modules, functions etc. One of the main exceptions is VBA, which isn’t case sensitive when it comes to things like variables.
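As a quick illustration of the point above, here is a minimal sketch (the variable names are just made up for the example). It shows that x and X are separate variables, and that getting the case of a module name wrong gives a NameError.

```python
import math

# x and X are two completely separate variables
x = 10
X = 20
print(x == X)  # False

# The module was imported as "math", so "Math" is an unknown name
try:
    Math.sqrt(4)
except NameError as exc:
    print(exc)  # name 'Math' is not defined

# Using the correct case works as expected
print(math.sqrt(4))  # 2.0
```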
Creating different conda environments instead of using base
When you install Anaconda Python, it creates a base environment by default. In general, it’s better to create different conda environments and use them to install your libraries. Using different conda environments allows you to install different Python versions and also different versions of libraries too. If you have problems with any of these conda environments, you can delete them. If you just use your base environment and you have issues with it, you might end up having to reinstall Anaconda.
Variables or imported libraries seem to be missing in Jupyter notebooks
In most cases, you’ll have declared a variable or imported a library earlier on in a notebook. When you reopen the notebook, the kernel starts afresh, so unless you execute that earlier cell again, you’ll have issues (typically a NameError) in later cells. I often end up importing libraries (or declaring variables) several times in a notebook, as needed, so I don’t have to execute all the earlier cells.
Starting Jupyter notebooks in the wrong conda environment
You’ve spent ages getting a conda environment installed with all the right libraries you want, but it doesn’t appear to be the case that you can import any of the libraries in your Jupyter notebook. In practice, it’s likely you didn’t activate the right conda environment before starting the Jupyter notebook.
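If you’re not sure which environment your notebook is actually using, one way to check is to run something like the sketch below in a cell. The helper name describe_environment is my own invention for this example. It relies on sys.executable and sys.prefix from the standard library, and on the CONDA_DEFAULT_ENV environment variable, which conda sets when you activate an environment (it can be missing, or differ from the kernel’s interpreter, depending on how Jupyter was started, so treat it as a hint rather than proof).

```python
import os
import sys


def describe_environment():
    """Collect a few clues about which Python environment is running."""
    return {
        # Full path of the Python interpreter running this code
        "python_executable": sys.executable,
        # Root folder of the environment that interpreter belongs to
        "environment_prefix": sys.prefix,
        # Name of the activated conda environment, or None if not set
        "conda_env_name": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

If the paths printed point to your base environment rather than the one you carefully set up, that’s a strong sign the notebook was started from the wrong place.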
Trying to understand os.environ
If you’re trying to access external services like Quandl etc., you’ll often need to provide an API key, username, password etc. It probably isn’t a great idea to store them in your Jupyter notebook or your Python source code. It makes it difficult to share your code, and you might end up mistakenly committing the credentials to version control and pushing them somewhere like GitHub. One easy way to get around this is to set your credentials as environment variables, which can be accessed in your Python code using os.environ, which behaves like a special Python dictionary. There’s also the keyring library, which can access credentials stored in your operating system’s credential store from Python.
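Here’s a minimal sketch of reading a credential from os.environ. The variable name MY_SERVICE_API_KEY and the helper get_credential are hypothetical, purely for illustration. In real use you’d set the environment variable outside of Python (for example in your shell or operating system settings); the setdefault line is only there so that the example runs on its own.

```python
import os


def get_credential(name):
    """Read a credential from an environment variable, failing loudly if missing."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# For demonstration only: normally this is set outside your code,
# so the real key never appears in your notebook or source files.
os.environ.setdefault("MY_SERVICE_API_KEY", "demo-key-not-real")

api_key = get_credential("MY_SERVICE_API_KEY")

# Avoid printing the key itself; just confirm that we found something
print("API key loaded:", bool(api_key))
```

Raising an error when the variable is missing tends to be easier to debug than silently carrying on with an empty key and getting an obscure failure from the external service later.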
Installing a conda environment which has all the common libraries you’ll need
Anaconda Python comes with lots of libraries you’ll need for data science. However, there will still be some libraries that aren’t part of the standard Anaconda stack, which you might need to install with either conda or pip. Some libraries which require external dependencies, such as blpapi, Bloomberg’s Python API, can be easier to install using conda.
Note that it can be tricky to get library versions to work together, and conda helps somewhat to find conflicts. However, it can take a while for conda’s algorithm to resolve conflicts (and it can stall at times). One alternative is to use mamba, which is a much faster reimplementation of the conda package manager.
I’ve created conda environments for Windows, Linux and Mac for my teaching, which have lots of data science libraries, financial libraries (including Cuemacro ones!) and much more. They are on my teaching GitHub site, and I’ve spent quite a bit of time making sure the versions of the various libraries match. See this tutorial on installing my conda environments.
Upgrading major libraries like pandas, without testing properly
If you’ve ever done anything with time series in Python, it’s likely you’ve used Pandas in lots of your code. As a result, upgrading Pandas versions can impact your code. It’s a very good idea to test your code both before and after upgrading Pandas, given that the API can change somewhat between versions (in particular before 1.0). From my experience, I’ve often had to make small changes in my code when upgrading Pandas, which have taken up a few hours. If you’re working in a team, make sure everyone is on board before updating Pandas! Whilst I’ve given Pandas as an example, the same applies to any library which is used in your code.
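One small thing that can help here is recording which library versions you had before an upgrade, so you can see exactly what changed afterwards. Below is a rough sketch using importlib.metadata from the standard library (Python 3.8 onwards). The function names installed_versions and changed_versions are my own, and the version numbers in the last part are made up for the example. It complements, rather than replaces, actually running your tests before and after.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(package_names):
    """Return {package name: installed version}, with None if not installed."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = version(name)
        except PackageNotFoundError:
            versions[name] = None
    return versions


def changed_versions(before, after):
    """Return {package name: (old version, new version)} for packages that differ."""
    names = sorted(set(before) | set(after))
    return {
        name: (before.get(name), after.get(name))
        for name in names
        if before.get(name) != after.get(name)
    }


# Take a snapshot of the libraries you care about (run before and after upgrading)
snapshot = installed_versions(["pandas", "numpy"])
print(snapshot)

# Comparing two snapshots, here with made-up version numbers
before = {"pandas": "1.5.3", "numpy": "1.24.0"}
after = {"pandas": "2.0.0", "numpy": "1.24.0"}
print(changed_versions(before, after))  # {'pandas': ('1.5.3', '2.0.0')}
```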
Conclusion
Whilst there are lots of problems folks face when learning Python, the ones above are some of the most common ones I’ve been asked about. One of the reasons I’ve written this is also to show that if you’ve come up against some of these, you aren’t the only one!