What Are Python Reserved Keywords?

Python reserved keywords are words in the Python programming language that have a fixed, specific meaning in a program. Each keyword has its functionality defined by the language itself. We are not allowed to reuse these reserved keywords as names, and it is not possible to override them.

Why Do We Have Python Reserved Keywords?

A programming language is defined in part by a set of keywords that have specific functionality attached to them. Python is no different. The language defines a set of keywords that perform specific tasks within the program where they are used.

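A closely related group of names is Python's built-in functions. For example, print is a built-in function in Python 3 (in Python 2 it was a statement and a true keyword) which instructs the Python interpreter (i.e. the Python environment where Python programs run) to print a string to the output terminal. So a Python program line like:

print('Hello, World!')

will print the string:

Hello, World!

to the output screen where the user can see it. Strictly speaking, Python 3 does not stop us from reusing the name print for a variable or a function, but doing so hides the built-in function and is best avoided. True reserved keywords such as if, for, while, def and class are stricter: the interpreter never allows them to be used as a variable name or function name.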

Similarly, the built-in function input is used to receive input from the user of a Python program. So a line in the program like:

user_name = input('Enter your name')

will display the string:

Enter your name

on the user's screen and wait until the user enters their name. Once they enter the name and hit the “Enter” key, the name gets stored in the variable “user_name”.

So as you can see here, names such as print and input each have a very specific functionality attached to them in the Python language, and we should not reuse them as variable or function names even though Python 3 permits it. Reserved keywords are enforced more strictly: trying to use a keyword such as if or for as a variable name or function name results in the interpreter raising a SyntaxError.
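The difference between a true keyword and a built-in name can be checked without breaking anything. The sketch below is illustrative only: the helper name can_be_assigned is made up for this example, and it relies on the built-in compile() function to test whether an assignment statement is valid syntax without actually running it.

```python
def can_be_assigned(name):
    """Return True if `name = 1` is valid Python syntax.

    compile() only parses the statement; nothing is executed, so no
    built-in name is actually overwritten by this check.
    """
    try:
        compile(f"{name} = 1", "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Reserved keywords are rejected by the parser itself.
print(can_be_assigned("for"))        # False
print(can_be_assigned("class"))      # False

# Built-in function names and ordinary names are accepted.
print(can_be_assigned("print"))      # True
print(can_be_assigned("user_name"))  # True
```

Being accepted by the parser does not make reusing print a good idea; it only shows that the restriction on keywords is enforced at a different level.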

Now that we understand what reserved keywords are, what can we do about them?

For one, we need to know about all the Python reserved keywords to avoid using them in other ways in our program. But in addition to this, knowing about these reserved keywords and their intended functionalities will also help us write useful programs.
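One way to see all the reserved keywords is to ask Python itself. The standard library ships a module called keyword for this purpose. The snippet below is a small sketch; the exact list it prints depends on the Python version you are running.

```python
import keyword

# The full list of reserved keywords for the running Python version.
print(keyword.kwlist)

# iskeyword() answers the question for a single word.
print(keyword.iskeyword("while"))  # True
print(keyword.iskeyword("print"))  # False in Python 3
print(keyword.iskeyword("input"))  # False
```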

Using Reserved Keywords In Python Programs

Python programs are built by combining reserved keywords, built-in functions and operators with a set of variables to perform certain operations. For example, take a look at the program below:

user_name = input('Enter your name')
print('Hello, ' + user_name + '!')

This program simply prompts the user to enter their name and then greets them by name. So when I run this program, the output I get is something akin to this:

Enter your name
> Amar
Hello, Amar!

Conclusion

So in short, we can say that reserved keywords are a set of words in Python that have pre-defined meaning and functionalities associated with them. We make use of these keywords to write our program and we are not allowed to re-use the same words in our variables or function names. In other words, we are not allowed to alter their pre-defined meaning.

Django Web Framework Beginner Tutorial – Introduction

What is Python Django?

Django is a Python based web development framework. It is a collection of libraries and tools that can be used to develop websites and web applications. Django uses Python as its primary backend programming language.

Learn more about Frontend & Backend components of a web app here

Why use Django Web Development Framework for developing a web app?

In the early days of the internet, not many programming languages or supporting libraries were available for the development of websites. So, every website developer ended up writing many frequently used components over and over.

These included features like user authentication, database reads and writes, protection against Cross Site Scripting (XSS) and malware, SQL injection prevention etc.

Every time a new website was built, the web developer had to rewrite these pieces of code over and over again. This increased the time needed to complete a project. It also exposed the website to vulnerabilities caused by poor testing or poor design.

In order to overcome these problems, developers started to create common web development frameworks. These contained all the frequently used components such as authentication and session management code, and were later made available to others as web framework libraries.

Soon enough, these libraries started being developed in different programming languages as well. Django is one such web application development framework that was developed using the Python programming language.

Why Use A Web Development Framework Like Django?

The main advantage of using Django is the number of readily available components it comes with. All the bells and whistles required to develop a basic web application are present in Django. Modules like user management, an admin dashboard, session management, protection against XSS and CSRF protection are all readily available. This makes Django one of the quickest web development frameworks to get started with. You can go live with a website in very little time because of this.

In addition to this, Django also has framework extensions such as Django REST framework (DRF) that can be used to enhance the capabilities of a Django web application.

All these features of Django make it one of the most appealing “batteries included” web app development frameworks in the tech industry.

In addition to this, if you are already familiar with Python programming, then using Django becomes very easy.

Django is not the only Python web development framework. There exist other Python based web development frameworks like Flask, web2py and many more. But what makes Django different and easier to get started with is the set of included modules we discussed earlier.

Who is currently using Python Django Web Framework in real world?

Some of the top tech companies using Python Django include Instagram, Quora, Mozilla, Disqus, National Geographic, Last.fm etc.

This was a theoretical introduction to Django. In the upcoming articles, we will get our hands dirty by using Django to develop a few simple web apps. This should give you a clear idea of the advantages of Django and why it is extremely useful.

How To Extract Data From A Website Using Python

In this article, we are going to learn how to extract data from a website using Python. Extracting data from a website is called “web scraping” or “data scraping”. We can write programs in languages such as Python to perform web scraping automatically.

In order to understand how to write a web scraper using Python, we first need to understand the basic structure of a website. We have already written an article about it here on our website. Take a quick look at it once before proceeding here to get a sense of it.

The way to scrape a webpage is to find specific HTML elements and extract their contents. So, to write a website scraper, you need a good understanding of HTML elements and their syntax.

Assuming you have a good understanding of these prerequisites, we will now proceed to learn how to extract data from a website using Python.

How To Fetch A Web Page Using Python

The first step in writing a web scraper using Python is to fetch the web page from the web server to our local computer. One can achieve this by making use of a readily available Python package called urllib.

The urllib package is part of the Python standard library, so it is installed along with Python itself. There is nothing to install with the Python package manager pip.

Since urllib is already available, we can start using it right away to fetch the web page and scrape its data.

For the sake of this tutorial, we are going to extract data from the Wikipedia article on comets found here:

https://en.wikipedia.org/wiki/Comet

This Wikipedia article contains a variety of HTML elements such as text, images, tables, headings etc. We can extract each of these elements separately using Python.

How To Fetch A Web Page Using The Urllib Python Package

Let us now fetch this web page using the Python library urllib by running the following code:

import urllib.request
content = urllib.request.urlopen('https://en.wikipedia.org/wiki/Comet')

read_content = content.read()

The first line:

import urllib.request

will import the request module of the urllib package into our Python program. We will make use of the urlopen function of this module to send an HTTP GET request to the Wikipedia server asking for the web page. The URL of the web page is passed as the parameter to this function.

content = urllib.request.urlopen('https://en.wikipedia.org/wiki/Comet')

As a result of this, the Wikipedia server will respond with the HTML content of this web page. The response is returned as a response object, which is stored in the Python program’s “content” variable.

The content variable does not hold the HTML text directly. It is an object that wraps the server's response, including the response headers and the response body. To get the HTML itself, we need to read the body of the response. We achieve this in the next line of the program by calling the read() method of the response object.

read_content = content.read()

The above line of Python code gives us the complete HTML of the page as a sequence of bytes. This includes everything the server sent, both the human readable content and directives for the web browser such as <meta> tags.

At this point in our program we have the full HTML document containing all the elements that we would be interested in. It is now time to extract individual data elements of the web page.
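The fetching steps above can be wrapped into a small reusable function. This is a minimal sketch rather than a definitive implementation: the function name fetch_html, the User-Agent value and the 10 second timeout are all assumptions made for this example. It also decodes the bytes returned by read() into a normal Python string.

```python
import urllib.request


def fetch_html(url, timeout=10):
    """Fetch `url` and return the response body decoded to a str."""
    # The User-Agent value is a made-up placeholder for this example.
    request = urllib.request.Request(
        url, headers={"User-Agent": "example-scraper/0.1"}
    )
    # The with-block closes the connection once the body has been read.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw_bytes = response.read()  # the whole body, as bytes
        # Use the charset announced by the server, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
    return raw_bytes.decode(charset)
```

It could then be used as html_text = fetch_html('https://en.wikipedia.org/wiki/Comet'), and the resulting string can be handed to an HTML parser in the same way as the bytes in read_content.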

How To Extract Data From Individual HTML Elements Of The Web Page

In order to extract individual HTML elements from our read_content variable, we need to make use of another Python library called Beautiful Soup. Beautiful Soup is a Python package that can understand HTML syntax and elements. Using this library, we will be able to extract the exact HTML element we are interested in.

We can install the Beautiful Soup package into our local development system by issuing the command:

pip install beautifulsoup4

Once the Beautiful Soup package is installed, we can start using it to extract HTML elements from our web content. Recall that we had earlier stored our web content in the Python variable “read_content”. We are now going to pass this variable along with the parser name ‘html.parser’ to BeautifulSoup to extract HTML elements as shown below:

from bs4 import BeautifulSoup
soup = BeautifulSoup(read_content,'html.parser')

From this point onwards, our “soup” Python variable holds all the HTML elements of the webpage. So we can start accessing each of these HTML elements by using its find and find_all methods.

How To Extract All The Paragraphs Of A Web Page

For example, if we want to extract the first paragraph of the wikipedia comet article, we can do so using the code:

pAll = soup.find_all('p')

The above code will extract all the paragraphs present in the article and assign them to the variable pAll. Now pAll contains a list of all paragraphs, so each individual paragraph can be accessed through indexing. So in order to access the first paragraph, we issue the command:

pAll[0].text

The output we obtain is:

'\n'

So the first paragraph only contained a new line. What if we try the next index?

pAll[1].text
'\n'

We again get a newline! Now what about the third index?

pAll[2].text
"A comet is an icy, small Solar System body that..."

And now we get the text of the first paragraph of the article! If we continue further with indexing, we can see that we continue to get access to every other HTML <p> element of the article. In a similar way, we can extract other HTML elements too as shown in the next section.
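As the session above shows, the first few <p> elements may contain nothing but whitespace. A small helper can skip them. This is only a sketch: the function name non_empty_texts is made up for this example, and it works on plain strings so that it can be tried without fetching a page.

```python
def non_empty_texts(texts):
    """Strip surrounding whitespace and drop entries that end up empty."""
    cleaned = (text.strip() for text in texts)
    return [text for text in cleaned if text]


# Stand-in for [p.text for p in pAll]: two blank paragraphs, then real text.
sample = ["\n", "\n", "A comet is an icy, small Solar System body that...\n"]
print(non_empty_texts(sample))
# ['A comet is an icy, small Solar System body that...']
```

With the soup from earlier, the call would look like non_empty_texts(p.text for p in pAll).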

How To Extract All The H2 Elements Of A Web Page

Extracting the H2 elements of a web page can be achieved in the same way as we did for the paragraphs earlier. By simply issuing the following command:

h2All = soup.find_all('h2')

we can filter and store all H2 elements into our h2All variable.

So with this we can now access each of the H2 elements by indexing the h2All variable:

>>> h2All[0].text
'Contents'
>>> h2All[2].text
'Physical characteristics[edit]'
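Notice that the heading text above ends with an “[edit]” marker, which comes from the edit link next to each Wikipedia heading. If only the heading itself is wanted, the marker can be stripped with a regular expression. The helper name clean_heading below is an assumption made for this sketch.

```python
import re

# Matches a trailing "[edit]" marker, with optional whitespace around it.
EDIT_MARKER = re.compile(r"\s*\[edit\]\s*$")


def clean_heading(text):
    """Remove a trailing "[edit]" marker from a heading's text."""
    return EDIT_MARKER.sub("", text).strip()


print(clean_heading("Physical characteristics[edit]"))  # Physical characteristics
print(clean_heading("Contents"))                        # Contents
```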

Conclusion

So there you have it. This is how we extract data from a website using Python, by making use of two important libraries: urllib and Beautiful Soup.

We first pull the web page content from the web server using urllib and then we use Beautiful Soup over the content. Beautiful Soup then provides us with many useful functions (find_all, text etc.) to extract individual HTML elements of the web page. By making use of these functions, we can address individual elements of the web page.

So far we have seen how we could extract paragraphs and h2 elements from our web page. But we do not stop there. We can extract any type of HTML element using a similar approach, be it images, links, tables etc. If you want to verify this, check out this other article where we have taken a similar approach to extract table elements from another Wikipedia article.

How to scrape HTML tables using Python

Python Program To Add Two Numbers

In this article, we will look at how to write a Python program to add two numbers. This is a very simple program, suitable even for a beginner who wants hands-on practice with Python.

In order to add two numbers in a Python program, we need to first break down the problem into the following steps:

Breakdown of the problem of Adding two numbers

  1. Receive the first input number from the user and store it in a Python variable.
  2. Receive the second input number from the user and store it in another Python variable.
  3. Add the two numbers by adding the two Python variables using a Python statement (Learn about Python statement here)
  4. Store the final result in another Python variable called “result”
  5. Print the value of the “result” variable.

In the above breakdown of the problem, you will notice that the two numbers to be added are not hard coded into the program. Instead, the program is written in a generic way: it prompts the user to enter the two input values every time it is run. This type of programming approach is often called general programming, as the program is generic enough to receive any two values each time it is run.

Pseudo-code To Add Two Numbers Using Python Programming Language

num1 = Receive First Input Number From The User
num2 = Receive Second Input Number From The User
result = (num1) Added to (num2)
Print the (result) on the screen

We can see from the above pseudo-code that this is a simple program that receives two numbers from the user, adds them and prints the result back on the screen. The pseudo-code gives us a nice little framework for writing our program. The same pseudo-code can be used to write a program to add two numbers in any programming language and not just Python!

Now that we have our problem broken down and the pseudo-code written, it is time for us to replace the pseudo-code with actual Python instructions.

Python Program To Add Two Numbers And Print Its Result

Fire up your Python IDLE interpreter and type in the code along with me to understand this program better:

Python 3.5.2 (default, Oct  8 2019, 13:06:37) 
 [GCC 5.4.0 20160609] on linux
 Type "copyright", "credits" or "license()" for more information.
>>>

Once in our Python interpreter, let us start typing in our Python program commands. The first thing we need to do according to the pseudo-code written above is to receive the first input number from our user. In a Python program, we receive a value from the user by calling the input() function. The input function accepts a string as its parameter, which is displayed to the user when the program is run. So, with this knowledge, the first line of our Python program to add two numbers will be:

>>> num1 = input('Enter the first number\n')
Enter the first number
10
>>> 

As you can see from the above code block, we have used the input() function to prompt our program user with the string “Enter the first number“. We are also saving the value entered by the user to a Python variable called num1.

The input function will then prompt with the above string and wait until the user enters a number. In the above code snippet, I had entered a value of 10 which is now stored in the variable num1.

In a similar way, we will write the next line of code which will prompt the user to enter a second input number that is to be added to the first number. This is achieved with the following piece of code:

>>> num2 = input('Enter the second number\n')
Enter the second number
20
>>> 

Again over here, I have entered my second number as 20 when prompted.

Now that we have the two numbers in our Python variables num1 and num2, it is time to add them to get the final result, which we are going to store in our third Python variable called result. This is achieved using the following Python statement:

>>> result = int(num1) + int(num2)
>>>

So from the above Python code, we have now added the numbers num1 and num2 using the Python arithmetic operator “+” and the built-in conversion function “int()”. We had to use int() because the input() function always returns what the user typed as a string. We can check this by issuing the following commands in the interpreter:

>>> print(num1)
10
>>> type(num1)
<class 'str'>
>>> 

So, by calling int(), we are converting this string value to an integer value.

>>> type(int(num1))
<class 'int'>
>>> 
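To see why the conversion matters, here is a short sketch of what happens when the two inputs are added with and without int(). The values "10" and "20" are simply the strings that input() would have returned in the session above.

```python
num1 = "10"  # what input() returns when the user types 10
num2 = "20"  # what input() returns when the user types 20

# Between two strings, + joins the text instead of adding the numbers.
print(num1 + num2)            # 1020

# After converting with int(), + performs arithmetic addition.
print(int(num1) + int(num2))  # 30
```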

With the addition done, we have stored the end result in a new variable called result. Since we have not issued a Python command to print the value of result, nothing gets printed on the IDLE prompt yet. So, the final step is exactly that: to print the value of the result variable onto the screen. This is achieved using another Python function called the print function.

>>> print (result)
30
>>>

Here is the full program that we can store in a file called add2num.py and run using the command python3 add2num.py every time we want to add any two numbers!

# Read both numbers from the user; input() always returns a string
num1 = input('Enter the first number\n')
num2 = input('Enter the second number\n')
# Convert the strings to integers before adding them
result = int(num1) + int(num2)
print(result)

This concludes our Python program to add two numbers.
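One thing the program above does not handle is a user typing something that is not a number, in which case int() raises a ValueError. The following is a possible sketch of a more forgiving version. The function names parse_whole_number, ask_for_number and main are made up for this example and are not part of the original program.

```python
def parse_whole_number(text):
    """Return text as an int, or None if it is not a whole number."""
    try:
        return int(text)
    except ValueError:
        return None


def ask_for_number(prompt):
    """Keep prompting until the user enters a whole number."""
    while True:
        number = parse_whole_number(input(prompt))
        if number is not None:
            return number
        print('Please enter a whole number such as 10.')


def main():
    num1 = ask_for_number('Enter the first number\n')
    num2 = ask_for_number('Enter the second number\n')
    print(num1 + num2)
```

To run it as a program, add a call to main() as the last line of the file.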

Additional Side Notes On Python Programming Language:

Python is an interpreted programming language: programs are interpreted and executed on the fly, so a program written in Python does not need a separate compile step as in the case of C or Java.

Blinking An LED Connected To GPIO Pin Of Raspberry Pi Using Python

Introduction

If you are just getting started with Raspberry Pi, connecting a simple LED to one of the GPIO pins of a Raspberry Pi and controlling it using a software program that you write will give you a very good grasp of how computer hardware and its programs work internally. You will see how you can control various aspects of computer hardware using software, how a computer works at the bit level, how to write Python programs to control hardware and more.

In summary, getting an LED connected to a GPIO pin of your Raspberry Pi working will help you understand the fundamentals of computer architecture and computer science in general.

Raspberry Pi 3B

What You Will Learn From This Project?

Connecting an LED to the GPIO pins of a Raspberry Pi to control it is a simple Beginner Raspberry Pi Project that lets you learn more about:

  • Raspberry Pi hardware internals
  • General Purpose Input/Output (GPIO) pins of a Raspberry Pi
  • Raspberry Pi Register Set
  • Ohm’s Law
  • Python Programming
  • Python Library – Raspberry Pi GPIO library
  • The working of a Light Emitting Diode (LED)

What Hardware Is Required To Set Up A Blinking LED Project?

This is a very simple, beginner friendly Raspberry Pi project that can be set up by anyone with minimal hardware or software knowledge. The hardware components required to set up this blinking LED project are also quite minimal. You need the following hardware components available to get it going:

  • Raspberry Pi Module
  • Solderless Breadboard
  • Keyboard
  • Monitor
  • Raspberry Pi Power Supply
  • SD Card with working Raspbian OS
  • Jumper wires for rigging up the circuit
  • LED
  • Resistor (100 Ohm)
  • Multimeter

Theory Behind How The Raspberry Pi Blinking LED Project Works

When you look at the Raspberry Pi board, you will see a bunch of pins protruding out. Among these, there is a row of 40 pins located on one side of the board as shown in the image below.

If you look closely enough in the above image, you will notice the label “GPIO” written right under it. These pins are called the GPIO pins or General Purpose Input Output pins. What the name GPIO implies is that these pins do not have any fixed functionality within the board and hence can be used for general purposes. It means that we can connect our LED into one of these pins and can turn it ON or OFF using these pins. But how?

How to control the Raspberry Pi GPIO pins programmatically?

The Raspberry Pi 3 board runs on Broadcom’s ARM CPU chipset BCM2837. Among many other things, this processor chipset has a built-in GPIO controller, aka General Purpose Input Output controller. Most of the pins on the 40-pin header shown in figure 1 are connected to controllable pins of the GPIO controller (the rest carry power and ground). Now, we can control each of these GPIO pins individually by programming the appropriate registers inside this GPIO controller.

To understand how to program each of these pins using GPIO controller, we need to look into the Technical Reference Manual or datasheet of the Broadcom ARM chipset BCM2837.

In the BCM2837 SOC (System On Chip aka CPU) datasheet linked above, if we jump into page 89 we come across a dedicated chapter talking about General Purpose Input Output (GPIO). If we go through this chapter, we can learn about all the GPIO registers available and figure out the GPIO registers we need to program to turn ON or OFF the LED we are going to connect to the Raspberry Pi 3 GPIO pins.

As the name implies, GPIO pins can be configured as either an Input pin or an Output pin. When we configure a GPIO pin as an input pin, we are sending data bit (either 0 or 1) into the Raspberry Pi BCM2837 SOC i.e. data signal is sent from outside the board to inside the board (hence the name input). On the other hand, if we configure the Raspberry Pi GPIO pin as an output pin, the board will send the data bit signal (either 0 or 1) from inside the board to the outer world where any device connected to it will receive this signal.

So, if we want to control an LED that is connected to one of the Raspberry Pi’s GPIO pins, we need to configure that pin as a GPIO OUT pin (aka output pin) so that we can send an electrical signal from the Raspberry Pi board to the external LED connected to this pin.

The configuration of a GPIO pin as an INPUT or OUTPUT pin is controlled by programming a GPIO controller register called the GPIO Function Select Register (GPFSELn). Here n is the register number, not the pin number: each GPFSEL register holds a 3-bit field for each of 10 pins, so GPFSEL0 covers GPIO 0 to 9, GPFSEL1 covers GPIO 10 to 19 and so on.

So for example, if we choose to use the GPIO 8 pin to control the LED, i.e. we connect our LED to the GPIO 8 pin, we need to program the GPFSEL0 register and configure GPIO 8 as an output pin. When we check the datasheet at pages 91 and 92, we notice that GPIO 8 is configured by setting bits 26 to 24 in the GPFSEL0 register (that is, the field named FSEL8). And from the datasheet, we also find that to set the pin as an output pin, we need to set this field to 001, i.e. bit 26 is set to 0, bit 25 is set to 0 and bit 24 is set to 1.

So, if we can somehow set these values in the GPFSEL0 register using a programming language such as Python, we will be able to start controlling the LED connected to this pin!
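The bit arithmetic described above can be tried out on ordinary Python integers. The sketch below does not touch any hardware; it only computes which register and which bits belong to a pin, and what the register value would look like after the pin's field is set to 001. The names fsel_location and with_function are made up for this example, and the layout of ten 3-bit fields per register is assumed to be the one described in the datasheet's GPIO chapter.

```python
FSEL_INPUT = 0b000   # function select value for "input"
FSEL_OUTPUT = 0b001  # function select value for "output"


def fsel_location(gpio):
    """Return (register index n of GPFSELn, lowest bit of the FSEL field)."""
    register_index = gpio // 10   # each GPFSEL register covers 10 pins
    low_bit = (gpio % 10) * 3     # each pin owns a 3-bit field
    return register_index, low_bit


def with_function(register_value, gpio, function):
    """Return register_value with the pin's 3-bit field set to `function`."""
    _, low_bit = fsel_location(gpio)
    mask = 0b111 << low_bit
    # Clear the old 3-bit field first, then write the new value into it.
    return (register_value & ~mask) | (function << low_bit)


print(fsel_location(8))                        # (0, 24)
print(hex(with_function(0, 8, FSEL_OUTPUT)))   # 0x1000000
```

For GPIO 8 the field starts at bit 24 of register 0, which matches bits 26 to 24 of the FSEL8 field mentioned above. Clearing the field before writing keeps the settings of the other nine pins in the same register unchanged.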

If all of this is overwhelming, do not worry. We do not have to scratch our heads over it for now, as we can simply make use of Raspberry Pi’s GPIO Python library, which does most of this work for us. I just wanted to explain what this GPIO Python library is doing under the hood.

How To Connect An LED To Raspberry Pi GPIO?

Designing The Circuit

In order to connect an LED to GPIO pin 8 of Raspberry Pi, we need to first design and understand how the circuit is going to work.

Can we connect an LED directly to a Raspberry Pi GPIO pin without a resistor?

The answer is no. A Raspberry Pi GPIO output pin provides 3.3 Volts according to the Raspberry Pi datasheet specification. However, a standard LED normally operates at a much lower voltage. If we look at an LED specification, we notice that a typical LED operates at just 1.7 Volts and draws 20 mA. So, if we want to connect this LED to the GPIO pin of our Raspberry Pi, we need to bring the voltage delivered to the LED down to 1.7 Volts or less. How do we do that? We connect a resistor in series with our LED so that the 3.3 Volts GPIO output of the Raspberry Pi gets split between the resistor and the LED. By choosing a resistor value such that the resistor drops 1.6 Volts, we ensure that the LED gets only 1.7 Volts.

Calculating the resistor value to connect with LED and Raspberry Pi GPIO

In order to calculate the value of the resistor that we should be using, we make use of Ohm’s Law.

Ohm’s Law is defined using the equation:

V = I × R where V is the voltage across the resistor, I is the current through it and R is the resistance.

So, if we want V = 1.6 Volts dropped across our resistor while the current coming from the GPIO pin is I = 20 mA, we need to connect a resistor whose value is:

R = V / I = 1.6 V / 0.020 A

or R = 80 Ohms (or approx 100 Ohms, the next commonly available value)

So, we choose a resistor of value 100 Ohms connected in series with our LED. With 1.6 Volts across 100 Ohms the current works out to 16 mA, slightly under the 20 mA that our LED is rated for, while the LED still gets about 1.7 Volts.
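The same calculation can be written as a couple of small Python functions, which makes it easy to try other supply voltages or LED ratings. This is just a sketch of the Ohm's Law arithmetic (R = V / I); the function names are made up for this example and the numbers are the ones used in this article.

```python
def series_resistor_ohms(supply_volts, led_volts, led_amps):
    """Resistance needed to drop the leftover voltage at the LED current."""
    return (supply_volts - led_volts) / led_amps


def current_amps(supply_volts, led_volts, resistor_ohms):
    """Current that flows once a given resistor is in place."""
    return (supply_volts - led_volts) / resistor_ohms


ideal = series_resistor_ohms(3.3, 1.7, 0.020)
print(round(ideal))                                # 80
print(round(current_amps(3.3, 1.7, 100) * 1000))   # 16 (mA)
```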

A 100 Ohm resistor is identified by the color bands: Brown, Black & Brown.

Hook up the LED through the 100 Ohm resistor to the GPIO 8 pin of the Raspberry Pi as shown in the figure below:

Note that when you are hooking up the LED, the longer terminal pin is the positive one. Once you have connected everything as shown in the figure, it is time to program the Raspberry Pi GPIO controller to turn the LED ON or OFF.

We will be using Python to program our Raspberry Pi GPIO controller. Now, the simplest way to program this is by making use of the Python GPIO library.

To install the Python Raspberry Pi GPIO module, open up your linux terminal and type the following command.

sudo apt-get install python-rpi.gpio python3-rpi.gpio

The above command will install the required Python GPIO library module onto our Raspberry Pi. Once it is successfully installed, it is time to start programming the Raspberry Pi GPIO controller.

We will be toggling our GPIO pin at 1 second intervals so that our LED turns ON and OFF until the Python program we write is terminated, i.e. we will run an infinite loop that toggles the GPIO 8 pin ON and OFF.

Create a new file on your computer by typing the following command in the terminal:

touch blinky.py

This should create our new program file called blinky.py

Open up this file using nano editor by typing the following command in the terminal:

nano blinky.py

Now that the file is opened, it is time to start writing our program to control the GPIO Pin 8 using Python GPIO library module.

First things first, we will import the Python GPIO library module using the command:

import RPi.GPIO as GPIO

Next, we will import the sleep function from the Python time library to perform a 1 second sleep between each GPIO toggle:

from time import sleep

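Next, we need to configure the GPIO library to use physical pin numbering, i.e. the position of each pin on the Raspberry Pi header:

GPIO.setmode(GPIO.BOARD)

With BOARD numbering, the number 8 in the program refers to the eighth pin on the physical header, so the LED must be wired to that pin. Note that physical pin 8 carries the signal that the Broadcom datasheet calls GPIO 14. The datasheet's GPIO 8 sits on physical pin 24, and would be selected by number 8 only if we used GPIO.setmode(GPIO.BCM) instead.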

Next, configure GPIO pin 8 to be a GPIO Out pin and set its initial output value to be low:

GPIO.setup(8, GPIO.OUT, initial=GPIO.LOW)

Finally, we will start an infinite loop in Python in which we turn pin 8 ON (by setting it HIGH) and then turn it OFF (by setting it LOW), with a 1 second delay after each change. This is achieved using the program below:

while True: # Infinite loop
    GPIO.output(8, GPIO.HIGH) # Turn GPIO 8 pin on
    sleep(1)                  # Delay for 1 second
    GPIO.output(8, GPIO.LOW)  # Turn GPIO 8 pin off
    sleep(1)                  # Delay for 1 second

That’s it. This is the whole program that we need to type into our blinky.py file. We can then run it using the command:

python3 blinky.py

This should start turning your LED ON and OFF every second!

Here is the full code for your reference:

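import RPi.GPIO as GPIO
from time import sleep

GPIO.setmode(GPIO.BOARD)                   # Use physical header pin numbers
GPIO.setup(8, GPIO.OUT, initial=GPIO.LOW)  # Pin 8 as output, starting LOW

try:
    while True:                    # Infinite loop, stop with Ctrl+C
        GPIO.output(8, GPIO.HIGH)  # Turn pin 8 on
        sleep(1)                   # Delay for 1 second
        GPIO.output(8, GPIO.LOW)   # Turn pin 8 off
        sleep(1)                   # Delay for 1 second
except KeyboardInterrupt:
    pass                           # Ctrl+C ends the loop quietly
finally:
    GPIO.cleanup()                 # Release the pin when the program exits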

This concludes our tutorial on how to get a simple LED connected to a General Purpose Input/Output (GPIO) pin turning ON and OFF using a Python program that makes use of the Python Raspberry Pi GPIO library. There can be many variants of this, such as using other GPIO pins, or connecting several LEDs to multiple GPIO pins and controlling them in different ways to display interesting patterns. If we are even more curious, we can also figure out a way to control the built-in LEDs that are already present on our Raspberry Pi boards, overriding their current usage so that our own programs can use them for our purposes.

We will delve into these and many other interesting ways to make use of our Raspberry Pis to understand and learn more about computer hardware, its architecture and much more in our future articles.

Basic Structure Of A Web App/Website

A website or a web app is usually made up of the following 3 web components:

These components are arranged as shown in the following diagram.

Structure Of A Web App
Location of the web app components on the internet

Web App Back-End component

All the core logic of a web app or website is usually implemented in the web app’s back-end component. This includes all the algorithms of the web app, code to store and retrieve data from the database, URL based route handling etc. All of this code forms the back-end of a web app or website and runs on a specialized computer called the web server.

Backend component of the above image

The code for the back-end can be written in several different programming languages such as PHP, Python, Java, JavaScript, Ruby etc. Each of these languages comes with its own advantages and disadvantages. For example, Python offers a large number of readily available libraries that are useful for data crunching, but Python programs can be relatively slow compared to programs written in some other languages. So which back-end programming language to choose largely depends on the functionality required by the particular web app that is going to be developed.

Web App Front-End Component

All the visual and interactive elements of a web app or website come under its front-end component. Users who visit the website only see and interact with the front-end component. So as far as they are concerned, the web app is usually just the front-end: it is what they see and interact with in their web browser.

Front end component of the above image

However, the front-end component does not usually store all the relevant data of a web page within itself. Instead, it asks for the data by sending requests to the web app’s back-end component that was discussed earlier. These requests are usually sent using the HTTP protocol.
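The request and response exchange described above can be sketched using only Python’s standard library. In this illustration Python plays both roles: a small local server stands in for the back-end, and an HTTP client stands in for the browser-side front-end (which in a real web app would be JavaScript running in the browser). The handler class, the function name and the /api/profile path are all made up for the example.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class BackEndHandler(BaseHTTPRequestHandler):
    """Plays the role of the back-end: answers GET requests with JSON."""

    def do_GET(self):
        body = json.dumps({"path": self.path, "message": "hello from the back-end"})
        payload = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # Keep the example output quiet.


def fetch_from_back_end(path):
    """Start a throwaway local server, send one GET request, return (status, data)."""
    server = HTTPServer(("127.0.0.1", 0), BackEndHandler)  # port 0: any free port
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", path)
        response = connection.getresponse()
        data = json.loads(response.read().decode("utf-8"))
        connection.close()
        return response.status, data
    finally:
        server.shutdown()
        server.server_close()


print(fetch_from_back_end("/api/profile"))
```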

The languages used to write a web app’s front-end component are HTML, CSS and JavaScript. Among these, HTML is a markup language that tells the web browser which HTML elements need to be drawn on the screen to represent the website. CSS is a styling language used to customize the style (color, size, background color etc.) of these HTML elements. Finally, JavaScript is a programming language that can be used to add interactive functionality to these HTML elements.

With the help of these three languages, your front-end can be made interactive and user friendly enough for any non-technical person to start using your web app.

Database component

A database is specialized software used to store and retrieve data efficiently on a computer or a server. The database can sit on the web server alongside the server software, or it can run on its own separate, dedicated database server.

Database component

Databases are usually used in a web app (or website) to store all the relevant data of that web app, such as user data, session data, web app specific data etc. There are many different kinds of databases available, such as relational databases, NoSQL databases, document-oriented databases, graph databases etc. Each kind has certain unique features that are useful in specific situations. The database type most commonly used in web apps is the relational database, for example MySQL or PostgreSQL. However, document-oriented databases such as MongoDB are also used frequently.
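As a small taste of storing and retrieving data in a relational database, here is a sketch that uses SQLite through Python’s built-in sqlite3 module. SQLite is used here only because it needs no separate server; it is not one of the databases named above, although the same SQL ideas carry over to MySQL or PostgreSQL. The table name, column names and sample user are invented for the example.

```python
import sqlite3

# An in-memory database disappears when the connection is closed,
# which keeps this example free of side effects.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
)

# Store data: placeholders (?) keep user input separate from the SQL text.
connection.execute(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    ("Asha", "asha@example.com"),
)
connection.commit()

# Retrieve data.
row = connection.execute(
    "SELECT name, email FROM users WHERE name = ?", ("Asha",)
).fetchone()
print(row)

connection.close()
```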

So these three components form the fundamental elements of a web application. Of course, more than these three components may be required as your web app continues to grow. You may need to add multiple servers, load balancers to manage higher traffic, caching mechanisms etc. We will discuss these in future articles, but knowing the three fundamental components above should give you a good start when you begin developing your first web application.

Do comment below if you liked the article or if you have any questions regarding the above topic and I will be happy to answer your questions. Until then, happy learning! 🙂

Categories
PYTHON TUTORIALS

How to scrape HTML tables using Python

Python is a versatile programming language that can be used to write programs for a wide variety of applications. The large number of available libraries makes Python one of the most useful programming languages for performing numerous tasks. Be it a simple script to automate basic shell commands in an operating system, or a program to perform data analysis or machine learning, Python handles them all well, thanks to the available library packages.

In this article, we will learn how to use the Python programming language for one of the most common tasks in the world of the web: HTML scraping, also known as web scraping.

Web scraper Illustrative picture

All the websites we view in our favorite web browser are written mainly using three front-end web languages: HTML, CSS and JavaScript. Each of these three languages has a specific role to play in the creation of a web page:

HTML: HTML stands for Hyper Text Markup Language. It is a simple markup language used to create the various elements that make up a web page. The headings, paragraphs, lists, images, tables, headers and footers, links etc. that we see in a web page are all different HTML elements.

CSS: CSS stands for Cascading Style Sheets. It is a style sheet language that is mainly responsible for the look and feel of the HTML elements mentioned above. You might have seen the same table contents displayed in two different styles on two different websites. Even though both websites use the same HTML table element to create this content, each website styles the table differently, and this is achieved using CSS.

JavaScript: JavaScript is a programming language that was originally developed for use in web browsers, but nowadays has made its way into all parts of web development, both on the front end (browser side) and on the back end (server side). On the front end, JavaScript is used to add interactive functionality to the HTML elements of a web page. For example, many web pages these days offer infinite scrolling, where only the first few content elements are loaded and the rest are loaded dynamically as we scroll towards the bottom of the page. The Twitter home page is a good example of this. This sort of interactive functionality is added using JavaScript. Almost all the interactivity of a web page is achieved with the help of JavaScript these days.

When a web page is rendered in a browser on the user’s computer, the page consists of all these HTML elements, with the text and image content of the page embedded within them. So we can retrieve this text and image content from a web page using a programming language such as Python. This process is called “Web Scraping” in the web development world.

Scraping A Web Page Using Python

In order to learn how to scrape a web page using Python, we will try to scrape a table that lists mountains across the world ordered by their elevation, as seen on the following Wikipedia page:

https://en.wikipedia.org/wiki/List_of_mountains_by_elevation

On this Wikipedia web page, we notice several tables. The first table lists the mountains with an elevation of 8000 meters or above. It is this table that we would like to scrape using Python.

Introduction to BeautifulSoup library in Python

As mentioned in the beginning of this article, Python comes with a myriad of useful libraries whose APIs let us perform complex tasks with ease. One such library is called “BeautifulSoup”, and it is one of the most interesting libraries that one can use in Python to perform web scraping.

BeautifulSoup Python library’s functionalities

One of the most important features of Python’s BeautifulSoup library is its ability to parse and interpret HTML tags. All HTML elements are represented using what are called HTML tags. Some examples of such tags are <h1> for the main heading, <p> for paragraphs and <table> for tables. The BeautifulSoup library understands these tags and can extract the information that a web page holds within them. The library exposes APIs for this, which we will use in the Python web scraper program that we are about to write.

The BeautifulSoup library is published on the Python Package Index (PyPI) as ‘beautifulsoup4’ and is imported in Python code under the name ‘bs4’. It can be installed on your computer for developing the web scraper using the command:

pip install beautifulsoup4
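Once the library is installed, the tag handling described above can be tried on a small piece of HTML before moving on to a real web page. The HTML snippet below is hand-written sample data for this sketch; find( ), find_all( ) and get_text( ) are BeautifulSoup methods.

```python
from bs4 import BeautifulSoup

# A small hand-written HTML snippet, invented for this example.
html = """
<h1>Mountains</h1>
<p>Some of the highest peaks on Earth.</p>
<table>
  <tr><th>Name</th><th>Metres</th></tr>
  <tr><td>Mount Everest</td><td>8849</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

heading = soup.find("h1").get_text()       # text inside the first <h1> tag
paragraph = soup.find("p").get_text()      # text inside the first <p> tag
table_count = len(soup.find_all("table"))  # number of <table> tags found

print(heading)
print(paragraph)
print(table_count)
```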

BeautifulSoup library example

In order to understand how the BeautifulSoup library works, let us download a Wikipedia web page to our local system. For this example, let us download the following Wikipedia web page:

https://en.wikipedia.org/wiki/List_of_mountains_by_elevation

Let us save the web page from above link as mountains.html in our local home directory (~/).

We can then read the content of this web page using Python’s BeautifulSoup library with the following commands:

import os
from bs4 import BeautifulSoup

# open() does not expand '~', so build the full path to the home directory first
path = os.path.expanduser('~/mountains.html')

with open(path, 'r', encoding='utf-8') as html_file:
    soup = BeautifulSoup(html_file.read(), 'html.parser')

tables = soup.find_all('table')

print(tables)

Well, that’s a mouthful of code. Let us go through it step by step to understand what we are doing here. The first two lines:

import os
from bs4 import BeautifulSoup

import Python’s standard os module and the BeautifulSoup class from the bs4 package we just installed. The next line:

path = os.path.expanduser('~/mountains.html')

converts ‘~/mountains.html’ into the full path of the file in our home directory. This step is needed because Python’s open( ) function does not expand the ‘~’ shortcut by itself. The next line:

with open(path, 'r', encoding='utf-8') as html_file:

uses Python’s file operation function open( ) to open the previously downloaded mountains.html web page, assuming it was saved in the UTF-8 encoding. The with block closes the file automatically once we are done with it. Note that we avoid calling the file variable ‘input’, because that would hide Python’s built-in input( ) function. Inside the with block:

    soup = BeautifulSoup(html_file.read(), 'html.parser')

we call BeautifulSoup and pass it two arguments. The first is the content of our mountains.html web page, obtained with the standard file operation function read( ). The second is ‘html.parser’, which tells BeautifulSoup to treat the content as HTML data and to parse it with Python’s built-in HTML parser. The resulting parsed HTML data is assigned to the variable ‘soup’ for later use. In the next line we do this:

tables = soup.find_all('table')

Here we search the ‘soup’ variable for all the available HTML tables and assign the result to a new variable, ‘tables’. So by now, all the HTML tables present in the mountains.html file are available in the list-like variable ‘tables’.

Finally, we print the content of the ‘tables’ variable, which prints all the tables found in our mountains.html web page!

While this is good and all, we downloaded the Wikipedia web page manually, saved it as mountains.html and only then used Python’s BeautifulSoup library to process it. Wouldn’t it be great if we could eliminate this manual step and do even this programmatically? As a next step, we will do exactly this using another Python library, urllib, which is introduced next.

Introduction to Python Urllib library

Another important Python library that we are going to use to create our web scraper program is called the urllib library. Let us see what functionalities Python’s urllib library brings to us.

Python’s urllib library is used to fetch the contents of a web page from its URL. It provides functions such as urlopen( ) and read( ) to open a web page and read its contents. URL stands for Uniform Resource Locator: it is the web address that one can use to locate a web page and fetch its contents.

How to install Python Urllib library?

In Python 3, urllib is part of the standard library that ships with Python itself, so no separate installation with pip is needed.

Python Urllib Example

Here is a simple example of urllib library that is used to fetch the content of a Wikipedia web page.

First we import the request module of the urllib library into our Python program using Python’s import statement:

import urllib.request

The urllib library is split into several modules. One of them is the request module, which we can use to open a web page and read its content. It provides the urlopen( ) function, which opens a URL and returns a response object. Calling the read( ) method on that object returns the page content as bytes. An example of a Python program using this module is given below, where we read the contents of a Wikipedia web page:

import urllib.request

content = urllib.request.urlopen('https://en.wikipedia.org/wiki/List_of_mountains_by_elevation')

read_content = content.read()

We can actually combine the two calls above, urlopen( ) and read( ), into a single line as shown below:

source = urllib.request.urlopen('https://en.wikipedia.org/wiki/List_of_mountains_by_elevation').read()
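For anything beyond a quick experiment, it can help to wrap the download in a small function that closes the connection, sets a timeout and converts the downloaded bytes to text. The sketch below is one possible way to do this with urllib.request. The function name fetch_html and the User-Agent string are made up for this example, and whether a particular website requires such a header is an assumption that depends on the site.

```python
import urllib.request


def fetch_html(url, timeout=10):
    """Download a page and return its content as text."""
    # Some servers reject requests that carry no descriptive User-Agent header.
    request = urllib.request.Request(
        url, headers={"User-Agent": "python-tutorial-scraper/0.1"}
    )
    # The with block closes the connection even if reading fails.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw_bytes = response.read()
        # Use the charset announced in the response, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
    return raw_bytes.decode(charset)


# Example call (requires an internet connection):
# html = fetch_html("https://en.wikipedia.org/wiki/List_of_mountains_by_elevation")
# print(html[:200])
```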

Python Web Scraper using Urllib and BeautifulSoup libraries

Finally, by combining the APIs provided by the BeautifulSoup and urllib libraries, we can write our web scraper program that reads a Wikipedia page’s contents, extracts its tables, and prints the rows of a particular table as shown below:

from bs4 import BeautifulSoup
import urllib.request

source = urllib.request.urlopen('https://en.wikipedia.org/wiki/List_of_mountains_by_elevation').read()
soup = BeautifulSoup(source,'html.parser')
tables = soup.find_all('table')
table_rows = tables[0].find_all('tr')
for tr in table_rows:
    print(tr)

The above program is our intended Python web scraper. It fetches a Wikipedia page using the urllib library and then uses the BeautifulSoup library to parse the page content and access its HTML elements.

Here we are simply printing the rows of the first “table” element of the Wikipedia page. However, BeautifulSoup can be used to perform far more complex scraping operations than what has been shown here.
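One natural next step is to turn each table row into a plain list of cell texts instead of printing the raw HTML. The sketch below shows the idea on a small hand-written table, so it runs without an internet connection. The sample rows stand in for the downloaded page and are not scraped output. The same loop could be applied to the table rows found on the real page, although the cell layout there may differ.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for a downloaded page; the values are sample data.
html = """
<table>
  <tr><th>Mountain</th><th>Metres</th></tr>
  <tr><td>Mount Everest</td><td>8849</td></tr>
  <tr><td>K2</td><td>8611</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
first_table = soup.find_all("table")[0]

rows = []
for tr in first_table.find_all("tr"):
    # Header cells use <th>, data cells use <td>; collect both.
    cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
    rows.append(cells)

for row in rows:
    print(row)
```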

I will explain more such operations that one can perform with the BeautifulSoup library in future articles, but this should serve as an entry point for someone who is just getting started with web scraping in Python.


Categories
PYTHON TUTORIALS

Difference between expression and statement in Python

A Python expression can be defined as any element in our program that evaluates to some value. Well, what does this mean? To understand it better, let us fire up our Python interpreter and take a deep dive into this topic on Python expressions with these examples.

Once in our python interpreter, let us type the following command:

Python 3.5.2 (default, Nov 12 2018, 13:43:14) 
[GCC 5.4.0 20160609] on linux
Type "copyright", "credits" or "license()" for more information.
>>> 4
4
>>> 

We can see that when we simply enter the number ‘4’ into our Python interpreter, it is accepted and evaluated to the integer value 4. Hence, we can say that the input ‘4’ we entered is an expression.

Similarly, if we input the command ‘4 + 1’ to the Python interpreter:

Python 3.5.2 (default, Nov 12 2018, 13:43:14) 
[GCC 5.4.0 20160609] on linux
Type "copyright", "credits" or "license()" for more information.
>>> 4
4
>>> 4 + 1
5
>>>

Our interpreter goes ahead, evaluates ‘4 + 1’ and produces the value 5. Here too, the input ‘4 + 1’ can be called an expression, as it resulted in a value of 5.

Similarly, if we enter this code to the Python interpreter we get,

Python 3.5.2 (default, Nov 12 2018, 13:43:14) 
[GCC 5.4.0 20160609] on linux
Type "copyright", "credits" or "license()" for more information.
>>> 4
4
>>> 4 + 1
5
>>> "Hello" + "World"
'HelloWorld'
>>> 

This too shows that irrespective of the data type used (string in this case as opposed to integers in the earlier examples), a Python expression results in the evaluation of the data (“Hello” and “World”) to a final value (“HelloWorld”). Thus “Hello” + “World” is also a Python expression.

On the other hand, if we take a look at this example:

Python 3.5.2 (default, Nov 12 2018, 13:43:14) 
[GCC 5.4.0 20160609] on linux
Type "copyright", "credits" or "license()" for more information.
>>> 4
4
>>> 4 + 1
5
>>> "Hello" + "World"
'HelloWorld'
>>> result = "Hello" + "World"
>>> result
'HelloWorld'
>>> 

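Here we are assigning the value of the evaluated expression to a variable named ‘result’. A command like this, where a value is assigned to a variable, is called a Python statement. Notice that the assignment itself prints nothing. Only when we enter the expression ‘result’ on the next line does the interpreter display a value.

So in other words, we can see that a Python statement can contain one or more Python expressions.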
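One way to see this relationship for ourselves is to ask Python how it parses the line. The sketch below uses the standard ast module, which exposes Python’s own syntax tree. It is offered only as an illustration of the idea and is not needed for everyday programming.

```python
import ast

# Parse one line of source code into Python's syntax tree.
tree = ast.parse('result = "Hello" + "World"')
statement = tree.body[0]

# The whole line is an assignment statement...
print(type(statement).__name__)               # Assign
print(isinstance(statement, ast.stmt))        # True

# ...and the right-hand side of the assignment is an expression.
print(type(statement.value).__name__)         # BinOp
print(isinstance(statement.value, ast.expr))  # True
```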

Expression Vs Statement

  • Expression
    • An expression always evaluates to a value
    • Function calls are also expressions. Even a call to a function without a return statement evaluates to None, so it is still an expression.
    • Its resulting value can be printed
    • Examples of Python expressions: “Hello” + “World”, 4 + 5 etc.
  • Statement
    • A statement performs an action and does not itself evaluate to a value
    • It has no resulting value to print
    • Examples of Python statements: assignment statements, conditional branching (if), loops (for, while), class definitions, import, def, try/except, pass, del etc.
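The points in the list above can be checked with a short experiment. Python’s built-in compile( ) function accepts only expressions when it is called in "eval" mode, so it can serve as a rough expression detector. The helper names is_expression and greet are invented for this sketch.

```python
def is_expression(source):
    """Return True if the source text is a single Python expression."""
    try:
        # "eval" mode only accepts expressions; statements raise SyntaxError.
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


def greet():
    print("Hello")  # No return statement, so calling greet() evaluates to None.


print(is_expression("4 + 1"))                       # True
print(is_expression('"Hello" + "World"'))           # True
print(is_expression('result = "Hello" + "World"'))  # False
print(is_expression("import os"))                   # False
print(greet() is None)                              # prints Hello, then True
```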

Summary

In simpler terms, we can say that anything that evaluates to something is a Python expression, while on the other hand, anything that does something is a Python statement. Curious to learn further? Follow our other articles in this blog to know more!