Python Urllib3 - Accessing Web Resources Via HTTP

Uploaded by

Juan Cuartas

ZetCode

Python urllib3
last modified July 6, 2020

This Python urllib3 tutorial introduces the urllib3 module. We show how to fetch data, post data,
stream data, work with JSON, and follow redirects.

ZetCode also has a concise Python tutorial.

The Hypertext Transfer Protocol (HTTP) is an application protocol for distributed, collaborative,
hypermedia information systems. HTTP is the foundation of data communication for the World
Wide Web.

Python urllib3

The urllib3 module is a powerful, sanity-friendly HTTP client for Python. It supports thread
safety, connection pooling, client-side SSL/TLS verification, file uploads with multipart encoding,
helpers for retrying requests and dealing with HTTP redirects, gzip and deflate encoding, and
proxy support for HTTP and SOCKS.

$ pip install urllib3

We install the urllib3 module with pip.

Python urllib3 version

The first program prints the version of the urllib3 module.

version.py
#!/usr/bin/env python3

import urllib3

print(urllib3.__version__)

The program prints the version of urllib3.

$ ./version.py
1.24.1

This is a sample output of the example.

Python urllib3 status

HTTP response status codes indicate whether a specific HTTP request has been successfully
completed. Responses are grouped in five classes:

Informational responses (100–199)
Successful responses (200–299)
Redirects (300–399)
Client errors (400–499)
Server errors (500–599)
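The five classes listed above follow from the first digit of the status code. The following is a minimal sketch that uses only the standard library; the function names status_class() and describe() are our own and are not part of urllib3.

```python
from http import HTTPStatus

# Class names keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: 'Informational',
    2: 'Successful',
    3: 'Redirect',
    4: 'Client error',
    5: 'Server error',
}


def status_class(code):
    """Return the class name for an HTTP status code (100-599)."""
    if not 100 <= code <= 599:
        raise ValueError(f'not an HTTP status code: {code}')
    return STATUS_CLASSES[code // 100]


def describe(code):
    """Combine the code, its standard phrase (if known) and its class."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The code is in range but has no registered phrase.
        phrase = 'Unknown'
    return f'{code} {phrase} ({status_class(code)})'


if __name__ == '__main__':
    for code in (200, 301, 404, 503):
        print(describe(code))
```

A value such as resp.status from the examples in this tutorial could be passed to describe() directly.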

status.py
#!/usr/bin/env python3

import urllib3

http = urllib3.PoolManager()

url = 'http://webcode.me'

resp = http.request('GET', url)

print(resp.status)

The example sends a GET request to webcode.me and prints the status code of the response.

http = urllib3.PoolManager()

We create a PoolManager to generate a request. It handles all of the details of connection pooling
and thread safety.

url = 'http://webcode.me'

This is the URL to which we send the request.

resp = http.request('GET', url)

With the request() method, we make a GET request to the specified URL.

print(resp.status)

We print the status code of the response.

$ ./status.py
200

The 200 status code means that the request has succeeded.

Python urllib3 GET request

The HTTP GET method requests a representation of the specified resource.

get_request.py
#!/usr/bin/env python3

import urllib3

http = urllib3.PoolManager()

url = 'http://webcode.me'

resp = http.request('GET', url)

print(resp.data.decode('utf-8'))

The example sends a GET request to the webcode.me webpage. It returns the HTML code of the
home page.

resp = http.request('GET', url)

A GET request is generated.

print(resp.data.decode('utf-8'))

We get the data of the response and decode it into text.

$ ./get_request.py
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>My html page</title>
</head>
<body>

<p>
Today is a beautiful day. We go swimming and fishing.
</p>

<p>
Hello there. How are you?
</p>

</body>

This is the output.
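The decode('utf-8') call in the example assumes that the page is encoded in UTF-8. As a hedged sketch, the charset can instead be taken from the Content-Type header when the server sends one. The helpers charset_from_content_type() and decode_body() are hypothetical names of our own; they use only the standard library.

```python
from email.message import Message


def charset_from_content_type(content_type, default='utf-8'):
    """Return the charset parameter of a Content-Type value, or the default."""
    if not content_type:
        return default
    msg = Message()
    msg['Content-Type'] = content_type
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset() or default


def decode_body(body, content_type=None):
    """Decode raw response bytes using the charset named in Content-Type."""
    return body.decode(charset_from_content_type(content_type))


if __name__ == '__main__':
    raw = 'Dobrý deň'.encode('iso-8859-2')
    print(decode_body(raw, 'text/html; charset=ISO-8859-2'))
```

With a urllib3 response, the two arguments would be resp.data and resp.headers.get('Content-Type').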

Python urllib3 HEAD request

A HEAD request is identical to a GET request, except that the server returns only the status line
and the headers, without a message body.

head_request.py
#!/usr/bin/env python3

import urllib3

http = urllib3.PoolManager()

url = 'http://webcode.me'
resp = http.request('HEAD', url)

print(resp.headers['Server'])
print(resp.headers['Date'])
print(resp.headers['Content-Type'])
print(resp.headers['Last-Modified'])

In the example, we create a HEAD request to the webcode.me website.

print(resp.headers['Server'])
print(resp.headers['Date'])
print(resp.headers['Content-Type'])
print(resp.headers['Last-Modified'])

The response object contains the headers dictionary, which has the various header fields, such as
server and date.

$ ./head_request.py
nginx/1.6.2
Thu, 20 Feb 2020 14:35:14 GMT
text/html
Sat, 20 Jul 2019 11:49:25 GMT

From the output we can see that the web server of the website is nginx and the content type is
HTML code.

Python urllib3 HTTPS request

urllib3 provides client-side TLS/SSL verification. For this, we install the certifi
module. It is a carefully curated collection of Root Certificates for validating the trustworthiness of
SSL certificates while verifying the identity of TLS hosts. It has been extracted from the Requests
project.

$ pip install certifi

We install certifi.

import certifi

print(certifi.where())

To reference the installed certificate authority (CA) bundle, we use certifi's where() function.

status2.py
#!/usr/bin/env python3

import urllib3
import certifi

url = 'https://httpbin.org/anything'

http = urllib3.PoolManager(ca_certs=certifi.where())
resp = http.request('GET', url)

print(resp.status)

We create a GET request to the https://httpbin.org/anything page.

http = urllib3.PoolManager(ca_certs=certifi.where())

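We pass the root CA bundle to the PoolManager. With urllib3 versions older than 1.25, such as
the 1.24.1 shown earlier, a request made without a CA bundle is not verified and issues
the following warning: InsecureRequestWarning: Unverified HTTPS request is being
made. Adding certificate verification is strongly advised. Since version 1.25, urllib3 verifies
HTTPS certificates by default.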
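To illustrate what certificate verification means on the client side, here is a minimal sketch built on the standard ssl module rather than on urllib3. The function make_verifying_context() and its ca_bundle parameter are hypothetical names; ca_bundle is assumed to be a path to a CA file such as the one returned by certifi.where().

```python
import ssl


def make_verifying_context(ca_bundle=None):
    """Create a TLS context that requires a valid server certificate.

    ca_bundle is an optional path to a CA bundle file. When it is None,
    the system's default trust store is loaded instead.
    """
    return ssl.create_default_context(cafile=ca_bundle)


if __name__ == '__main__':
    ctx = make_verifying_context()
    # The certificate chain must validate against the trusted CAs ...
    print(ctx.verify_mode == ssl.CERT_REQUIRED)
    # ... and the certificate must match the requested host name.
    print(ctx.check_hostname)
```

Both checks are enabled in a context created this way, which is the behavior the CA bundle is meant to support.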

Python urllib3 query parameters

Query parameters are the part of a uniform resource locator (URL) which assigns values to
specified parameters. This is one way of sending data to the destination server.

http://example.com/api/users?name=John%20Doe&occupation=gardener

The query parameters are specified after the ? character. Multiple fields are separated with the &.
Special characters, such as spaces, are encoded. In the above string, the space is encoded with the
%20 value.
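The encoding rules described above can be tried out with the standard urllib.parse module. This is only a sketch of the URL format, not of what urllib3 does internally; build_url() is a helper name of our own.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def build_url(base, params):
    """Append params to base as a percent-encoded query string."""
    # quote_via=quote encodes a space as %20; the default would use '+'.
    query = urlencode(params, quote_via=quote)
    return f'{base}?{query}'


if __name__ == '__main__':
    url = build_url('http://example.com/api/users',
                    {'name': 'John Doe', 'occupation': 'gardener'})
    print(url)

    # Parsing the query string reverses the encoding.
    print(parse_qs(urlsplit(url).query))
```

The fields are joined with & and the space in John Doe becomes %20, which matches the sample URL shown earlier.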

query_params.py
#!/usr/bin/env python3

import urllib3
import certifi

http = urllib3.PoolManager(ca_certs=certifi.where())

payload = {'name': 'Peter', 'age': 23}

url = 'https://httpbin.org/get'
req = http.request('GET', url, fields=payload)

print(req.data.decode('utf-8'))

In the example, we send a GET request with some query parameters to the
https://httpbin.org/get. The link simply returns some data back to the client, including the
query parameters. The site is used for testing HTTP requests.

payload = {'name': 'Peter', 'age': 23}

This is the payload to be sent.

req = http.request('GET', url, fields=payload)

The query parameters are specified with the fields option.

$ ./query_params.py
{
"args": {
"age": "23",
"name": "Peter"
},
"headers": {
"Accept-Encoding": "identity",
"Host": "httpbin.org",
"X-Amzn-Trace-Id": "Root=1-5e4ea45f-c3c9c721c848f8f81a3129d8"
},
"origin": "188.167.251.9",
"url": "https://httpbin.org/get?name=Peter&age=23"
}

The httpbin.org responded with a JSON string, which includes our payload as well.

Python urllib3 POST request

The HTTP POST method sends data to the server. It is often used when uploading a file or when
submitting a completed web form.

post_request.py
#!/usr/bin/env python3

import urllib3
import certifi

http = urllib3.PoolManager(ca_certs=certifi.where())

url = 'https://httpbin.org/post'

req = http.request('POST', url, fields={'name': 'John Doe'})

print(req.data.decode('utf-8'))

The example sends a POST request. The data is specified with the fields option.

$ ./post_request.py
{
"args": {},
"data": "",
"files": {},
"form": {
"name": "John Doe"
},
...
"url": "https://httpbin.org/post"
}

This is the output.

Python urllib3 send JSON

In requests, such as POST or PUT, the client tells the server what type of data is actually sent with
the Content-Type header.

send_json.py
#!/usr/bin/env python3

import urllib3
import certifi
import json

http = urllib3.PoolManager(ca_certs=certifi.where())

payload = {'name': 'John Doe'}

encoded_data = json.dumps(payload).encode('utf-8')

resp = http.request(
    'POST',
    'https://httpbin.org/post',
    body=encoded_data,
    headers={'Content-Type': 'application/json'})

data = json.loads(resp.data.decode('utf-8'))['json']
print(data)

The example sends JSON data.

payload = {'name': 'John Doe'}

encoded_data = json.dumps(payload).encode('utf-8')

We serialize the payload to a JSON string and encode it into bytes.

resp = http.request(
    'POST',
    'https://httpbin.org/post',
    body=encoded_data,
    headers={'Content-Type': 'application/json'})

We specify the Content-Type header in the request.

data = json.loads(resp.data.decode('utf-8'))['json']
print(data)

We decode the returned data back to text, parse it as JSON, and print the value stored under the
json key, which holds the JSON payload that httpbin.org received.
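The encoding and decoding steps can be checked without a network connection. The sketch below uses only the standard json module; build_json_request() and parse_json_body() are hypothetical helper names, and the returned body and headers are the values one would pass to the body and headers options of request().

```python
import json


def build_json_request(payload):
    """Return the encoded body and the headers for a JSON request."""
    body = json.dumps(payload).encode('utf-8')
    headers = {'Content-Type': 'application/json'}
    return body, headers


def parse_json_body(raw):
    """Decode raw response bytes and parse them as JSON."""
    return json.loads(raw.decode('utf-8'))


if __name__ == '__main__':
    body, headers = build_json_request({'name': 'John Doe'})
    print(body)
    print(headers)

    # A round trip returns the original dictionary.
    print(parse_json_body(body))
```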

Python urllib3 binary data

In the following example, we download binary data.

get_binary.py
#!/usr/bin/env python3

import urllib3

http = urllib3.PoolManager()

url = 'http://webcode.me/favicon.ico'
req = http.request('GET', url)

with open('favicon.ico', 'wb') as f:
    f.write(req.data)

The example downloads a small icon.

with open('favicon.ico', 'wb') as f:
    f.write(req.data)

The req.data is in a binary format, which we can directly write to the disk.

Python urllib3 stream data

Chunked transfer encoding is a streaming data transfer mechanism available since HTTP 1.1. In
chunked transfer encoding, the data stream is divided into a series of non-overlapping chunks.

The chunks are sent out and received independently of one another. Each chunk is preceded by its
size in bytes.
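To make the framing concrete, the following sketch encodes and decodes a body in the chunked format: each chunk is written as its size in hexadecimal, a CRLF, the data, and another CRLF, and a zero-sized chunk ends the stream. The functions encode_chunked() and decode_chunked() are our own illustration; urllib3 handles this framing itself, and the sketch ignores optional trailers.

```python
def encode_chunked(data, chunk_size):
    """Split data into chunks and frame them for chunked transfer encoding."""
    out = bytearray()
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # Size line in hexadecimal, then the chunk data, each ended by CRLF.
        out += f'{len(chunk):X}\r\n'.encode('ascii') + chunk + b'\r\n'
    # A chunk of size zero marks the end of the body.
    out += b'0\r\n\r\n'
    return bytes(out)


def decode_chunked(wire):
    """Reassemble the body from chunked framing (trailers are not handled)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = wire.index(b'\r\n', pos)
        # Ignore optional chunk extensions that follow a semicolon.
        size = int(wire[pos:line_end].split(b';')[0], 16)
        pos = line_end + 2
        if size == 0:
            break
        body += wire[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


if __name__ == '__main__':
    wire = encode_chunked(b'Hello, world', 5)
    print(wire)
    print(decode_chunked(wire))
```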

Setting preload_content to False means that urllib3 will stream the response content. The
stream() method iterates over chunks of the response content. When streaming, we should call
release_conn() to release the HTTP connection back to the connection pool so that it can be
reused.

streaming.py
#!/usr/bin/env python3

import urllib3
import certifi

url = "https://docs.oracle.com/javase/specs/jls/se8/jls8.pdf"

local_filename = url.split('/')[-1]

http = urllib3.PoolManager(ca_certs=certifi.where())

resp = http.request(
    'GET',
    url,
    preload_content=False)

with open(local_filename, 'wb') as f:
    for chunk in resp.stream(1024):
        f.write(chunk)

resp.release_conn()

In the example, we download a PDF file.

resp = http.request(
    'GET',
    url,
    preload_content=False)

With preload_content=False, we enable streaming.

with open(local_filename, 'wb') as f:
    for chunk in resp.stream(1024):
        f.write(chunk)

We iterate over the chunks of data and save them to a file.

resp.release_conn()

In the end, we release the connection.

Python urllib3 redirect

A redirect sends users and search engines to a different URL from the one they originally
requested. The redirect option of request() controls whether redirects are followed; it is True by
default, and we set it explicitly here.

redirect.py
#!/usr/bin/env python3

import urllib3
import certifi

http = urllib3.PoolManager(ca_certs=certifi.where())

url = 'https://httpbin.org/redirect-to?url=/'
resp = http.request('GET', url, redirect=True)

print(resp.status)
print(resp.geturl())
print(resp.info())

The example follows a redirect.

$ ./redirect.py
200
/
HTTPHeaderDict({'Date': 'Fri, 21 Feb 2020 12:49:29 GMT', 'Content-Type': 'text/html;
charset=utf-8', 'Content-Length': '9593', 'Connection': 'keep-alive',
'Server': 'gunicorn/19.9.0', 'Access-Control-Allow-Origin': '*',
'Access-Control-Allow-Credentials': 'true'})

This is the output.
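To see the difference the redirect option makes without depending on an external site, the sketch below starts a small local server from the standard library and requests the same URL twice. The RedirectHandler class, the /old and /new paths, and the run_demo() function are all hypothetical names made up for this illustration; it assumes urllib3 is installed.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import urllib3


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer /old with a 302 redirect to /new, everything else with 200."""

    def do_GET(self):
        if self.path == '/old':
            self.send_response(302)
            self.send_header('Location', '/new')
            self.send_header('Content-Length', '0')
            self.end_headers()
        else:
            body = b'final page'
            self.send_response(200)
            self.send_header('Content-Type', 'text/plain')
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def run_demo():
    # Port 0 lets the operating system pick a free port.
    server = HTTPServer(('127.0.0.1', 0), RedirectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()

    url = f'http://127.0.0.1:{server.server_port}/old'
    http = urllib3.PoolManager()
    try:
        followed = http.request('GET', url, redirect=True)
        not_followed = http.request('GET', url, redirect=False)
        return (followed.status, followed.data,
                not_followed.status, not_followed.headers['Location'])
    finally:
        http.clear()
        server.shutdown()
        server.server_close()


if __name__ == '__main__':
    print(run_demo())
```

With redirect=True the client ends up at the final page; with redirect=False it receives the 302 response itself and can inspect the Location header.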

Python urllib3 Flask example

In the following example, we send a request to a small Flask web application. Learn more about
the Flask web framework in the Python Flask tutorial.

$ pip install flask

We need to install the flask module.

app.py
#!/usr/bin/env python3

from flask import Flask
from flask import request

app = Flask(__name__)

@app.route('/headers')
def hello():

    ua = request.headers.get('user-agent')
    ka = request.headers.get('connection')

    return f'User agent: {ua}; Connection: {ka}'

The application has one route. It sends the user agent and connection header fields of a request to
the client.

send_req.py
#!/usr/bin/env python3

import urllib3

http = urllib3.PoolManager()

url = 'localhost:5000/headers'

headers = urllib3.make_headers(keep_alive=True, user_agent='Python program')

resp = http.request('GET', url, headers=headers)
print(resp.data.decode('utf-8'))

In this program, we send a request to our Flask application.

url = 'localhost:5000/headers'

Flask runs on port 5000 by default.

headers = urllib3.make_headers(keep_alive=True, user_agent='Python program')

With the make_headers() helper method, we create a headers dictionary.

resp = http.request('GET', url, headers=headers)

We send a GET request to the URL; we specify the headers dictionary.

print(resp.data.decode('utf-8'))

We print the response to the terminal.

$ export FLASK_APP=app.py
$ flask run

We run the Flask application.

$ ./send_req.py
User agent: Python program; Connection: keep-alive

From a different terminal, we launch the send_req.py program.

In this tutorial, we have worked with the Python urllib3 module.
