Problem definition and development environment
The goal of this project is to build a real application in Python, as a way to get to know the features of the language through practical cases.
The project uses web scraping, a technique that consists of analyzing a website to extract information from its pages.
This technique can be used to collect data for statistics, performance analysis, and similar purposes.
In our case we will focus on obtaining data from the website to improve the SEO positioning of our website.
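To make the idea concrete, here is a minimal sketch of what extracting SEO-related data from a page can look like. It is only an illustration under some assumptions: the page content is a hypothetical inline string rather than a real download, the class name SeoParser is made up for this example, and the title and meta description are just two of the values one might collect. It relies only on the HTML parser from the standard library.

```python
# The parser module was renamed between Python 2.7 and Python 3,
# so we try the Python 3 name first and fall back to the 2.7 one.
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7


class SeoParser(HTMLParser):
    """Hypothetical helper: collects the <title> text and the meta description."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.in_title = False
        self.title = ""
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            attrs = dict(attrs)
            # An attribute without a value is reported as None, hence the "or".
            if (attrs.get("name") or "").lower() == "description":
                self.description = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Only keep text that appears between <title> and </title>.
        if self.in_title:
            self.title += data


# Hypothetical page content, used here instead of downloading a real page.
SAMPLE_HTML = """
<html>
  <head>
    <title>My example page</title>
    <meta name="description" content="A short summary for search engines.">
  </head>
  <body><h1>Hello</h1></body>
</html>
"""

parser = SeoParser()
parser.feed(SAMPLE_HTML)
print("Title: " + parser.title)
print("Description: " + str(parser.description))
```

Working on a fixed string keeps the sketch independent of any network connection; in a real scraper the same parser would be fed the HTML downloaded from the site.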
Before starting the project, we must choose a Python development environment in which to test the programs that we will be building.
You can choose the Python development environment that you like the most. In my case I have tried the two described below.
NetBeans
This development environment is mainly used for Java, but it can also be used for other languages or projects such as HTML, C++ or Python.
To work with this IDE you must download and install the Java JDK 8.
You can download the NetBeans 8.2 IDE from
In order to work with Python, we download the Python plugin for NetBeans
. We unzip it
. We open NetBeans and go to Tools – Plugins – Downloaded tab
. We click on “Add Plugins...” and select the downloaded plugin files
. We install the plugin and restart NetBeans
Visual Studio 2017
We go to the page
and download the Community edition of Visual Studio.
Once it is installed, we select the Python Tools for Visual Studio and install them.
If you do not want to download any IDE, you can use the interactive environment at https://repl.it, which runs directly in any web browser on a PC, tablet or mobile.
We will test our project with NetBeans, but you can use the IDE that you like the most.
. We open NetBeans
. We create the webscrap project with the menu option File – New Project
. We select Python – Python project ant
. We give it the name webscrap and create the project
For the examples we will use Python 2.7. If you have another version, the examples may vary.
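If you are not sure which version your environment runs, a short script like the following can tell you. It is only a suggested check, not part of the project itself, and the variable name version_text is chosen just for this example.

```python
import sys

# sys.version_info holds the interpreter version as (major, minor, micro, ...).
major, minor = sys.version_info[0], sys.version_info[1]
version_text = "%d.%d" % (major, minor)

print("Running Python " + version_text)
if (major, minor) != (2, 7):
    print("The examples target Python 2.7, so some of them may need changes.")
```

Running it as the first file of the webscrap project is also a quick way to confirm that the IDE and the Python plugin are set up correctly.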