Section 13.18 Write Code Exercises
Checkpoint 13.18.1.
Complete the following code, which retrieves the file ‘romeo.txt' from the host ‘data.pr4e.org'. Fill in the blanks in the connect() call and in the GET request line.
import socket
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('____', __))
cmd = '____ ________________________ HTTP/1.0\r\n\r\n'.encode()
mysock.send(cmd)
while True:
    data = mysock.recv(512)
    if len(data) < 1:
        break
    print(data.decode(), end='')
mysock.close()
Solution. Complete the following code, which retrieves the file ‘romeo.txt' from the host ‘data.pr4e.org'. Fill in the blanks in the connect() call and in the GET request line.
import socket
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('data.pr4e.org', 80))
cmd = 'GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode()
mysock.send(cmd)
while True:
    data = mysock.recv(512)
    if len(data) < 1:
        break
    print(data.decode(), end='')
mysock.close()
Checkpoint 13.18.2.
Complete the following code to extract the image ‘cover3.jpg' from the URL ‘http://data.pr4e.org/cover3.jpg' on the host ‘data.pr4e.org'. There are 5 blanks to fill in.
import socket
HOST = '________'
PORT = 80
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect((____, ____))
mysock.sendall(b'___ __________________ HTTP/1.0\r\n\r\n')
count = 0
picture = b""
while True:
    data = mysock.recv(5120)
    if len(data) < 1: break
    #time.sleep(0.25)
    count = count + len(data)
    print(len(data), count)
    picture = picture + data
mysock.close()
pos = picture.find(b"\r\n\r\n")
picture = picture[pos+4:]
fhand = open("stuff.jpg", "wb")
fhand.write(picture)
fhand.close()
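One possible completion is shown below; it fills the five blanks with the host name, the two arguments to connect(), the request verb GET, and the document URL from the prompt. This is a sketch along the lines of the romeo.txt example above, not the only valid answer.

import socket

HOST = 'data.pr4e.org'
PORT = 80
mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect((HOST, PORT))
mysock.sendall(b'GET http://data.pr4e.org/cover3.jpg HTTP/1.0\r\n\r\n')
count = 0
picture = b""

while True:
    data = mysock.recv(5120)
    if len(data) < 1: break
    count = count + len(data)
    print(len(data), count)
    picture = picture + data

mysock.close()

# Strip the HTTP headers: the image body starts after the blank line
pos = picture.find(b"\r\n\r\n")
picture = picture[pos+4:]

fhand = open("stuff.jpg", "wb")
fhand.write(picture)
fhand.close()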
Checkpoint 13.18.3.
Complete the following code, which retrieves the text from ‘http://data.pr4e.org/clown.txt' and prints the frequency of each word.
import urllib.request
fhand = urllib.request.urlopen('_________________')
Solution. Complete the following code, which retrieves the text from ‘http://data.pr4e.org/clown.txt' and prints the frequency of each word.
import urllib.request
counts = dict()
fhand = urllib.request.urlopen('http://data.pr4e.org/clown.txt')
for line in fhand:
    words = line.decode().split()
    for word in words:
        counts[word] = counts.get(word, 0) + 1
print(counts)
Checkpoint 13.18.4.
import urllib.request
Checkpoint 13.18.5.
Write a program that stores the image file from ‘http://data.pr4e.org/cover.jpg' to your disk.
import urllib.request, urllib.parse, urllib.error
Solution. Write a program that stores the image file from ‘http://data.pr4e.org/cover.jpg' to your disk.
import urllib.request, urllib.parse, urllib.error
img = urllib.request.urlopen('http://data.pr4e.org/cover.jpg').read()
fhand = open('cover.jpg', 'wb')
fhand.write(img)
fhand.close()
Checkpoint 13.18.6.
Complete the following program to extract all URLs from the webpage using a regular expression.
import urllib.request, urllib.parse, urllib.error
import re
url = "https://www.nytimes.com"
html = _______________________
links = _______(b'href="(http[s]?://.*?)"', html)
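One possible way to fill the two blanks is to read the raw bytes of the page with urllib.request.urlopen() and extract the links with re.findall(); the final print loop is added here only to show the result.

import urllib.request, urllib.parse, urllib.error
import re

url = "https://www.nytimes.com"
# Read the page as bytes and match it with a bytes regular expression
html = urllib.request.urlopen(url).read()
links = re.findall(b'href="(http[s]?://.*?)"', html)
for link in links:
    print(link.decode())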
Checkpoint 13.18.7.
Write a program that retrieves a text file from ‘https://www.gutenberg.org/files/1342/1342-0.txt' in several blocks of 100,000 characters, joins them, saves the result as ‘prideandprejudice.txt' on disk, and prints the number of characters.
import urllib.request, urllib.parse, urllib.error
txt = urllib.request.urlopen('___________________')
size = 0
while True:
    info = txt.read(100000)
    if len(info) < 1: break
    size = size + len(info)
Solution. Write a program that retrieves a text file from ‘https://www.gutenberg.org/files/1342/1342-0.txt' in several blocks of 100,000 characters, joins them, saves the result as ‘prideandprejudice.txt' on disk, and prints the number of characters.
import urllib.request, urllib.parse, urllib.error
txt = urllib.request.urlopen('https://www.gutenberg.org/files/1342/1342-0.txt')
fhand = open('prideandprejudice.txt', 'wb')
size = 0
while True:
    info = txt.read(100000)
    if len(info) < 1: break
    size = size + len(info)
    fhand.write(info)
print(size, 'characters copied.')
fhand.close()
Checkpoint 13.18.8.
import urllib.request, urllib.parse, urllib.error
Checkpoint 13.18.9.
Complete the following code to print all the image sources from the webpage. Use the ‘img' tag and its ‘src' attribute.
import requests
from bs4 import BeautifulSoup
url = "https://www.nytimes.com/"
resp = requests.get(url)
soup = BeautifulSoup(resp.content, 'html.parser')
Solution. Complete the following code to print all the image sources from the webpage. Use the ‘img' tag and its ‘src' attribute.
import requests
from bs4 import BeautifulSoup
url = "https://www.nytimes.com/"
resp = requests.get(url)
soup = BeautifulSoup(resp.content, 'html.parser')
tags = soup('img')
for tag in tags:
    print(tag.get('src', None))
Checkpoint 13.18.10.
Write code that extracts data from several parts of the ‘a' tag from ‘http://www.dr-chuck.com/page1.htm' using BeautifulSoup and html.parser, and prints the tag, the href, the contents, and all of the attributes.
import requests
from bs4 import BeautifulSoup
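One possible solution is sketched below; it prints the same parts of each anchor tag that the prompt asks for. The labels in the print calls are a formatting choice, not a requirement of the exercise.

import requests
from bs4 import BeautifulSoup

url = 'http://www.dr-chuck.com/page1.htm'
resp = requests.get(url)
soup = BeautifulSoup(resp.content, 'html.parser')

# Retrieve all of the anchor tags and look at several parts of each one
tags = soup('a')
for tag in tags:
    print('TAG:', tag)
    print('URL:', tag.get('href', None))
    print('Contents:', tag.contents[0])
    print('Attrs:', tag.attrs)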