Using environment variable in Python to authenticate against 3rd party API

I am working on a Python script that uses the IEX Cloud API through a Python library called pyEX. For most APIs, you…
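The excerpt refers to reading credentials from the environment rather than hardcoding them. Here is a minimal sketch of that pattern; the variable name IEX_TOKEN and the helper function are my assumptions, not taken from the original script.

```python
import os

def get_api_token(var_name="IEX_TOKEN"):
    """Read an API token from an environment variable instead of hardcoding it.

    The name IEX_TOKEN is hypothetical; use whatever name your setup defines.
    """
    token = os.environ.get(var_name)
    if not token:
        raise RuntimeError(f"Please set the {var_name} environment variable")
    return token
```

The returned token can then be handed to whatever API client the script uses, keeping the secret out of source control.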
Playing With Pandas

This is the third part of the series of articles I am writing about my little project. In part 1, I created a web scraper to get the data I needed. In part 2, I added support for saving the collected data to a MongoDB database. In this part, I will look into how to clean up the collected data and add new features (columns) to it, to make it more suitable for analysis.

My primary motivation here is to learn new technologies as I progress, so my baby steps may not be the state of the art in this particular area; all tips, tricks, and corrections are welcome.

For this project I am using Python, and each day I love it more and more. I will use some of Python's cool libraries, such as pandas, and some cool tools, such as Jupyter notebooks.

To start, make sure that you have Jupyter Notebook installed on your machine, then launch it from the git repo folder:

pip install jupyter
cd /path/to/stackjobs
jupyter notebook

This starts the IPython (Jupyter) notebook server and opens a new browser window. Click on the "Enhancing and Extending data with Pandas" notebook to see and run the code that this article will describe. The enhance_data_with_pandas.py file contains the same code, so it can also be run without the IPython notebook.
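To give a flavour of the kind of clean-up and feature-adding the notebook performs, here is a small, self-contained pandas sketch. The column names and values are invented for illustration; they are not the actual schema of the scraped data.

```python
import pandas as pd

# Toy stand-in for the scraped job postings; the columns are hypothetical.
df = pd.DataFrame({
    "title": ["  Python Developer ", "Data Engineer"],
    "salary": ["50000", None],
})

# Clean up: strip stray whitespace and fill in missing values.
df["title"] = df["title"].str.strip()
df["salary"] = df["salary"].fillna("0").astype(int)

# Add a new feature column derived from the existing data.
df["mentions_python"] = df["title"].str.contains("python", case=False)
```

After this runs, the frame has tidy titles, integer salaries with missing values set to 0, and a new boolean column marking postings that mention Python.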
Scraping Stackoverflow Careers for Fun and Profit – Part 2

Let's continue with our project. To summarise what we did in the first part: we wrote a scraper in Python using the Scrapy framework that was capable of…
Scraping Stackoverflow Careers for Fun and Profit

When you want to learn something new, the best way to do it is to come up with a problem that can be useful to you…