Assignment 5 Extracting Data From XML

"""Extracting Data from XML

In this assignment you will write a Python program somewhat similar to
http://www.py4e.com/code3/geoxml.py. The program will prompt for a URL, read the
XML data from that URL using urllib, parse and extract the comment counts from
the XML data, and compute the sum of the numbers in the file.

We provide two files for this assignment. One is a sample file where we give you
the sum for your testing and the other is the actual data you need to process for
the assignment.

Sample data: http://py4e-data.dr-chuck.net/comments_42.xml (Sum=2553)
Actual data: http://py4e-data.dr-chuck.net/comments_97410.xml (Sum ends with 59)

You do not need to save these files to your folder since your program will read the
data directly from the URL. Note: Each student will have a distinct data URL for
the assignment, so only use your own data URL for analysis.
"""

# Sample run:
# Enter location: http://py4e-data.dr-chuck.net/comments_97410.xml
# Retrieving http://py4e-data.dr-chuck.net/comments_97410.xml
# Retrieved 4220 characters
# Count: 50
# Sum: 2259

import urllib.request as ur
import xml.etree.ElementTree as et

# Example input: 'http://python-data.dr-chuck.net/comments_42.xml'
url = input('Enter location: ')

print('Retrieving', url)
xml = ur.urlopen(url).read()
print('Retrieved', len(xml), 'characters')

# Parse the document and collect every <count> element at any depth.
tree = et.fromstring(xml)
counts = tree.findall('.//count')

total_number = 0  # how many <count> elements were found
total = 0         # running sum of their integer values

for count in counts:
    total += int(count.text)
    total_number += 1

print('Count:', total_number)
print('Sum:', total)
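
The parsing step can be checked without network access. The following is a minimal sketch, not part of the assignment itself: the helper name sum_counts and the inline SAMPLE_XML string are hypothetical, and the element layout (commentinfo / comments / comment / name, count) is an assumption about what the course data files look like. It relies only on xml.etree.ElementTree from the standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real course files are assumed to have a
# similar layout, with one <count> element per comment.
SAMPLE_XML = """<commentinfo>
  <comments>
    <comment><name>Ada</name><count>97</count></comment>
    <comment><name>Bob</name><count>3</count></comment>
    <comment><name>Cy</name><count>42</count></comment>
  </comments>
</commentinfo>"""


def sum_counts(xml_text):
    """Return (number of <count> elements, sum of their integer values)."""
    root = ET.fromstring(xml_text)
    # './/count' matches <count> elements at any depth below the root.
    values = [int(node.text) for node in root.findall('.//count')]
    return len(values), sum(values)


count, total = sum_counts(SAMPLE_XML)
print('Count:', count)
print('Sum:', total)
```

Keeping the parsing in a function that takes the XML text as an argument means the same logic can be run on the bytes returned by urlopen(url).read() or on a small hand-written string like the one above.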
