Python for the Java Programmer - Part 3

Three more Python tricks (OK, one is about a standard Python module) that I always get wrong the first time.

Reading a Web Page

import urllib2
page = urllib2.urlopen('http://www.infobart.com')
content = page.read()  

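# content is now a byte string (type str in Python 2),
# not a unicode string.
# Decode it using the charset declared in the Content-Type header.
# getparam() returns None when no charset is declared; the utf-8
# fallback is an assumption, not something the server promises.
encoding = page.headers.getparam('charset') or 'utf-8'
ucontent = unicode(content, encoding)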

# Reference: http://stackoverflow.com/questions/1020892/urllib2-read-to-unicode
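
For readers on Python 3, where urllib2 was folded into urllib.request and read() returns bytes, the same idea might look roughly like the sketch below. The helper names decode_body and fetch_text are made up for illustration, and the utf-8 default is an assumption for pages that declare no charset.

```python
# Python 3 sketch: urllib2 became urllib.request, and read() returns bytes.
from urllib.request import urlopen


def decode_body(raw, charset, default='utf-8'):
    """Decode raw bytes with the declared charset, or a default if none was declared."""
    return raw.decode(charset or default)


def fetch_text(url):
    """Download url and return its body as text (hypothetical helper)."""
    with urlopen(url) as page:
        # get_content_charset() returns None when the header names no charset.
        charset = page.headers.get_content_charset()
        return decode_body(page.read(), charset)


# Offline check of the decoding step: the same word as UTF-8 bytes with no
# declared charset, and as Latin-1 bytes with a declared charset.
same = decode_body(b'caf\xc3\xa9', None) == decode_body(b'caf\xe9', 'iso-8859-1')
print(same)  # True
```

Calling fetch_text('http://www.infobart.com') would then return a str directly, with no separate unicode() step.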

Sorting Lists

>>> l1 = ['BBB','aaa','ccc','DDD'] 
>>> # Standard sort 
>>> sorted(l1) 
['BBB', 'DDD', 'aaa', 'ccc'] 
>>> 
>>> # Passing a key function that is applied
>>> # to each value before comparing. Good when
>>> # an existing function already does the job.
>>> sorted(l1, key=str.lower)
['aaa', 'BBB', 'ccc', 'DDD'] 
>>> 
>>> # Using a lambda that returns the key
>>> # for each value: better if you want to call
>>> # a method on each value.
>>> sorted(l1, key=lambda s: s.lower()) 
['aaa', 'BBB', 'ccc', 'DDD'] 
>>> 
>>> # Also works with list.sort() 
>>> l1.sort(key=lambda s:s.lower()) 
>>> l1
['aaa', 'BBB', 'ccc', 'DDD']
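
Coming from Java, key= covers most of what a Comparator is used for. As a rough sketch with made-up sample data (the people list is purely illustrative), sorting records on one field could look like this:

```python
from operator import itemgetter

# Hypothetical sample data: (name, age) pairs.
people = [('bob', 31), ('Alice', 25), ('carol', 31), ('Dave', 19)]

# Case-insensitive sort on the first field.
by_name = sorted(people, key=lambda p: p[0].lower())

# itemgetter(1) builds the key function p -> p[1];
# reverse=True sorts in descending order.
by_age_desc = sorted(people, key=itemgetter(1), reverse=True)

print(by_name)
# [('Alice', 25), ('bob', 31), ('carol', 31), ('Dave', 19)]
print(by_age_desc)
# [('bob', 31), ('carol', 31), ('Alice', 25), ('Dave', 19)]
```

The sort is stable, so bob and carol (both 31) keep their original relative order even with reverse=True.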

String Formatting

b1 = True 
f1 = 7.0/22.0 
i1 = 234 
s1 = 'World!'  

s = 'Hello ' + s1 + str(b1) + str(f1) + str(i1) 
# s = 'Hello World!True0.318181818182234'   

s = 'Hello %s %r %.2f %i' % (s1, b1, f1, i1) 
# s = 'Hello World! True 0.32 234'  

# Preferred way of formatting a string
# (.2f means two decimal places; a bare .2 means two significant digits)
s = 'Hello {0} {1} {2:.2f} {3}'.format(s1, b1, f1, i1)
# s = 'Hello World! True 0.32 234'
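
A few more things str.format() can do, as a small sketch. The field names name and ratio and the sample values are only illustrative:

```python
f1 = 7.0 / 22.0

# Named fields instead of positional indexes.
named = 'Hello {name} {ratio:.2f}'.format(name='World!', ratio=f1)

# Width and alignment: right-align in 6 columns, left-align in 5,
# zero-pad an integer to 5 digits.
table_row = '{0:>6.2f}|{1:<5}|{2:05d}'.format(f1, 'ab', 234)

# A bare .2 means two significant digits, .2f means two decimal places.
significant = '{0:.2}'.format(123.456)
fixed = '{0:.2f}'.format(123.456)

print(named)      # Hello World! 0.32
print(table_row)  # '  0.32|ab   |00234'
print(fixed)      # 123.46
print(significant)
```

The last two lines show why the type letter matters: for small values such as 0.318 both specs happen to print 0.32, but for larger values they diverge.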