In this project, “Web Scraping with Python”, we take the address (URL) of a website and analyze its content to obtain information that can help improve the site's SEO positioning.
The project covers tasks such as:
. Get the content of a web page
. Get internal links from the website
. Get the metatag title of each page
. Get the metatag description
. Get the h1 tag
. Save the results in a file for later reference (txt, Excel, ...)
. Analyze whether the website follows the recommended SEO guidelines
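The extraction tasks above can be sketched with Python's standard library alone. This is a minimal sketch, not the course's own code: the names SEOParser, fetch_html and parse_seo, as well as the sample HTML, are assumptions made for illustration. It parses an HTML string and collects the title, the meta description, the h1 headings and the link targets.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class SEOParser(HTMLParser):
    """Hypothetical helper: collects title, meta description, h1 texts and hrefs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.h1 = []
        self.links = []
        self._current = None  # tag whose text is currently being captured

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._current = tag
        elif tag == "h1":
            self._current = tag
            self.h1.append("")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1[-1] += data


def fetch_html(url):
    """Download a page and decode it (assumes UTF-8 if no charset is declared)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def parse_seo(html):
    """Return the SEO-relevant fields found in an HTML string."""
    parser = SEOParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "description": parser.description,
        "h1": [text.strip() for text in parser.h1],
        "links": parser.links,
    }


# Sample document (made up for this sketch) so the example runs offline.
sample = """<html><head><title>Home page</title>
<meta name="description" content="A sample page">
</head><body><h1>Welcome</h1>
<a href="/about">About</a> <a href="https://other.example/x">Ext</a>
</body></html>"""

result = parse_seo(sample)
```

To analyze a live page, the same function could be called as parse_seo(fetch_html(url)) with a URL of your choice.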
As Python is an interpreted language, we can build the solution incrementally, starting with a basic version and refining the program until we reach the final solution to the problem.
We will implement the program step by step until it has all the functionality we require.
The course is organized into classes to follow the development of the project step by step.
The student will be given the source code of each step to follow.
At the end of the classes, a series of tasks will be proposed to the student to improve the solution and learn other concepts.
At the end of the course, the student will be given an e-book with the course content.
Project: Web Scraping with Python
Class 1: Definition of the problem and Development environment
Class 2: Reading the content of a web page with urllib
Class 3: Obtain internal links from a web page
Class 4: Decode links and see accents
Class 5: Create list of valid links
Class 6: Find all internal links on the website
Class 8: Get metatag description
Class 9: Student task: Get tag <h1>
Class 10: Save results to a file
Class 11: Conversion to Python version 3.6
Class 12: Student task: Generate HTML file to display in internet browser
Class 13: Student task: Rewrite program with Beautiful Soup module
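As an illustration of the link-handling steps in the outline (obtaining internal links, decoding them and keeping only valid ones), here is a minimal sketch based on urllib.parse. It is not the course's source code: the function name internal_links and the example URLs are assumptions made for this sketch.

```python
from urllib.parse import urldefrag, urljoin, unquote, urlparse


def internal_links(base_url, hrefs):
    """Return absolute, de-duplicated links that stay on base_url's host."""
    host = urlparse(base_url).netloc
    seen = []
    for href in hrefs:
        # Resolve relative links against the page URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parts = urlparse(absolute)
        is_web_link = parts.scheme in ("http", "https")
        if is_web_link and parts.netloc == host and absolute not in seen:
            seen.append(absolute)
    return seen


# Example href values (made up) as they might appear in a page.
hrefs = [
    "/about",
    "contact.html",
    "#top",
    "https://other.example/x",
    "mailto:me@example.com",
    "/about",
]
links = internal_links("https://example.com/index.html", hrefs)

# Percent-encoded characters can be decoded to display accents.
readable = unquote("https://example.com/caf%C3%A9")
```

Here the external link and the mailto link are discarded, the duplicate is kept only once, and the "#top" anchor resolves to the page itself.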
If you want to get started with Python syntax, you can also follow the free online Python course, which covers concepts such as data structures, functional programming, modules and libraries.
Python has gained wide acceptance in recent years thanks to the speed with which applications can be developed and the large number of libraries and modules available, which makes it well suited to Machine Learning, data analysis and Artificial Intelligence projects, among others.
Python brings together three programming paradigms: traditional imperative programming (as in C-family languages), object-oriented programming (as in Java or C#) and functional programming (as in Scala or Lisp). This gives us a range of programming options that few other languages offer.
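To make the three styles concrete, the following minimal sketch solves one small problem (adding up the squares of the even numbers in a list) in each of them. The problem and all names, such as EvenSquareSummer, are made up for illustration.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]

# Imperative style: explicit loop and a variable that is updated step by step.
total_imperative = 0
for n in numbers:
    if n % 2 == 0:
        total_imperative += n * n


# Object-oriented style: data and behaviour are bundled in a class.
class EvenSquareSummer:
    def __init__(self, values):
        self.values = list(values)

    def total(self):
        return sum(v * v for v in self.values if v % 2 == 0)


total_oo = EvenSquareSummer(numbers).total()

# Functional style: compose filter and reduce without mutating any state.
total_functional = reduce(
    lambda acc, n: acc + n * n,
    filter(lambda n: n % 2 == 0, numbers),
    0,
)
```

All three variables end up with the same value, 56.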