# 10. Brief Tour of the Standard Library

## 10.1. Operating System Interface

The os module provides dozens of functions for interacting with the
operating system:

    >>> import os
    >>> os.getcwd()      # Return the current working directory
    'C:\\Python311'
    >>> os.chdir('/server/accesslogs')   # Change current working directory
    >>> os.system('mkdir today')   # Run the command mkdir in the system shell
    0

Be sure to use the import os style instead of from os import *. This
will keep os.open() from shadowing the built-in open() function which
operates much differently.

The built-in dir() and help() functions are useful as interactive
aids for working with large modules like os:

    >>> import os
    >>> dir(os)
    <returns a list of all module functions>
    >>> help(os)
    <returns an extensive manual page created from the module's docstrings>

For daily file and directory management tasks, the shutil module provides
a higher level interface that is easier to use:

    >>> import shutil
    >>> shutil.copyfile('data.db', 'archive.db')
    'archive.db'
    >>> shutil.move('/build/executables', 'installdir')
    'installdir'

## 10.2. File Wildcards

The glob module provides a function for making file lists from directory
wildcard searches:

    >>> import glob
    >>> glob.glob('*.py')
    ['primes.py', 'random.py', 'quote.py']
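As a hedged illustration of wildcard searches that descend into subdirectories, the sketch below builds a small throwaway directory tree (the file names are made up for the example) and searches it with the `**` pattern, which `glob.glob()` honours when `recursive=True` is passed:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Create a small, made-up tree: two .py files and one .txt file.
    os.mkdir(os.path.join(root, 'pkg'))
    for name in ('primes.py', os.path.join('pkg', 'quote.py'), 'notes.txt'):
        with open(os.path.join(root, name), 'w') as f:
            f.write('')

    # '**' matches zero or more directory levels when recursive=True.
    pattern = os.path.join(root, '**', '*.py')
    found = sorted(os.path.relpath(p, root)
                   for p in glob.glob(pattern, recursive=True))

print(found)
```

The paths are made relative to the temporary root only to keep the printed list short; the separator in the result depends on the platform.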

## 10.3. Command Line Arguments

Common utility scripts often need to process command line arguments. These
arguments are stored in the sys module's argv attribute as a list. For
instance the following output results from running python demo.py one two three at the command line:

    >>> import sys
    >>> print(sys.argv)
    ['demo.py', 'one', 'two', 'three']

The argparse module provides a more sophisticated mechanism to process
command line arguments. The following script extracts one or more filenames
and an optional number of lines to be displayed:

    import argparse

    parser = argparse.ArgumentParser(
        prog='top',
        description='Show top lines from each file')
    parser.add_argument('filenames', nargs='+')
    parser.add_argument('-l', '--lines', type=int, default=10)
    args = parser.parse_args()
    print(args)

When run at the command line with python top.py --lines=5 alpha.txt beta.txt, the script sets args.lines to 5 and args.filenames
to ['alpha.txt', 'beta.txt'].
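The behaviour described above can be checked without a shell by handing `parse_args()` an explicit list in place of the real command line. This is a minimal sketch; the `build_parser` helper is a name introduced here for illustration and is not part of the original script:

```python
import argparse

def build_parser():
    # Hypothetical helper wrapping the same options as the top.py script.
    parser = argparse.ArgumentParser(
        prog='top',
        description='Show top lines from each file')
    parser.add_argument('filenames', nargs='+')
    parser.add_argument('-l', '--lines', type=int, default=10)
    return parser

# The list stands in for the arguments that would follow the script name.
args = build_parser().parse_args(['--lines=5', 'alpha.txt', 'beta.txt'])
print(args.lines, args.filenames)

# Omitting --lines falls back to the declared default.
defaults = build_parser().parse_args(['alpha.txt'])
print(defaults.lines)
```

Keeping parser construction in a function makes it easy to exercise the same options from tests as well as from the command line.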

## 10.4. Error Output Redirection and Program Termination

The sys module also has attributes for stdin, stdout, and stderr.
The latter is useful for emitting warnings and error messages to make them
visible even when stdout has been redirected:

    >>> sys.stderr.write('Warning, log file not found starting a new one\n')
    Warning, log file not found starting a new one

The most direct way to terminate a script is to use sys.exit().
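One common way to use `sys.exit()` is to let a main function return a status code and pass that value on. The sketch below is only an illustration: `main` and its usage message are hypothetical, and the `SystemExit` exception that `sys.exit()` raises is caught solely so the status can be displayed.

```python
import sys

def main(argv):
    # Hypothetical entry point: returns an exit status instead of exiting itself.
    if not argv:
        sys.stderr.write('usage: demo.py FILE...\n')
        return 2
    return 0

try:
    sys.exit(main([]))
except SystemExit as exc:
    # Caught here only to show the status; a real script would let it propagate.
    status = exc.code

print(status)
```

By convention a status of zero signals success and a nonzero status signals failure to the calling shell.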

## 10.5. String Pattern Matching

The re module provides regular expression tools for advanced string
processing. For complex matching and manipulation, regular expressions offer
succinct, optimized solutions:

    >>> import re
    >>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
    ['foot', 'fell', 'fastest']
    >>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
    'cat in the hat'

When only simple capabilities are needed, string methods are preferred because
they are easier to read and debug:

    >>> 'tea for too'.replace('too', 'two')
    'tea for two'

## 10.6. Mathematics

The math module gives access to the underlying C library functions for
floating point math:

    >>> import math
    >>> math.cos(math.pi / 4)
    0.7071067811865476
    >>> math.log(1024, 2)
    10.0

The random module provides tools for making random selections:

    >>> import random
    >>> random.choice(['apple', 'pear', 'banana'])
    'apple'
    >>> random.sample(range(100), 10)   # sampling without replacement
    [30, 83, 16, 4, 8, 81, 41, 50, 18, 33]
    >>> random.random()    # random float
    0.17970987693706186
    >>> random.randrange(6)    # random integer chosen from range(6)
    4

The statistics module calculates basic statistical properties
(the mean, median, variance, etc.) of numeric data:

    >>> import statistics
    >>> data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
    >>> statistics.mean(data)
    1.6071428571428572
    >>> statistics.median(data)
    1.25
    >>> statistics.variance(data)
    1.3720238095238095

The SciPy project <https://scipy.org> has many other modules for numerical
computations.

## 10.7. Internet Access

There are a number of modules for accessing the internet and processing internet
protocols. Two of the simplest are urllib.request for retrieving data
from URLs and smtplib for sending mail:

    >>> from urllib.request import urlopen
    >>> with urlopen('http://worldtimeapi.org/api/timezone/etc/UTC.txt') as response:
    ...     for line in response:
    ...         line = line.decode()             # Convert bytes to a str
    ...         if line.startswith('datetime'):
    ...             print(line.rstrip())         # Remove trailing newline
    ...
    datetime: 2022-01-01T01:36:47.689215+00:00

    >>> import smtplib
    >>> server = smtplib.SMTP('localhost')
    >>> server.sendmail('soothsayer@example.org', 'jcaesar@example.org',
    ... """To: jcaesar@example.org
    ... From: soothsayer@example.org
    ...
    ... Beware the Ides of March.
    ... """)
    >>> server.quit()

(Note that the second example needs a mailserver running on localhost.)

## 10.8. Dates and Times

The datetime module supplies classes for manipulating dates and times in
both simple and complex ways. While date and time arithmetic is supported, the
focus of the implementation is on efficient member extraction for output
formatting and manipulation. The module also supports objects that are timezone
aware.

    >>> # dates are easily constructed and formatted
    >>> from datetime import date
    >>> now = date.today()
    >>> now
    datetime.date(2003, 12, 2)
    >>> now.strftime("%m-%d-%y. %d %b %Y is a %A on the %d day of %B.")
    '12-02-03. 02 Dec 2003 is a Tuesday on the 02 day of December.'

    >>> # dates support calendar arithmetic
    >>> birthday = date(1964, 7, 31)
    >>> age = now - birthday
    >>> age.days
    14368

## 10.9. Data Compression
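The timezone-aware objects mentioned above can be sketched with the module's fixed-offset `timezone` class. The specific date, the five-hour offset, and the 'EST' label are assumptions chosen for illustration:

```python
from datetime import datetime, timedelta, timezone

# An aware datetime carries its UTC offset with it.
utc_time = datetime(2003, 12, 2, 15, 30, tzinfo=timezone.utc)

# A fixed-offset zone, five hours behind UTC (label chosen for illustration).
eastern = timezone(timedelta(hours=-5), 'EST')
local_time = utc_time.astimezone(eastern)

print(local_time.isoformat())    # 2003-12-02T10:30:00-05:00
print(local_time == utc_time)    # True: same instant, different wall-clock time
```

A fixed offset does not model daylight saving changes; it is used here only to keep the sketch self-contained.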

Common data archiving and compression formats are directly supported by modules
including: zlib, gzip, bz2, lzma, zipfile and
tarfile.

    >>> import zlib
    >>> s = b'witch which has which witches wrist watch'
    >>> len(s)
    41
    >>> t = zlib.compress(s)
    >>> len(t)
    37
    >>> zlib.decompress(t)
    b'witch which has which witches wrist watch'
    >>> zlib.crc32(s)
    226805979

## 10.10. Performance Measurement

Some Python users develop a deep interest in knowing the relative performance of
different approaches to the same problem. Python provides a measurement tool
that answers those questions immediately.
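The other compression modules listed above follow a similar pattern. As a small sketch, assuming data that fits in memory, `gzip.compress()` and `gzip.decompress()` give the same kind of round trip for the gzip format:

```python
import gzip

# Repeat the sample text so the compressor has redundancy to exploit.
text = b'witch which has which witches wrist watch' * 20

packed = gzip.compress(text)        # bytes in gzip format
restored = gzip.decompress(packed)  # back to the original bytes

print(len(text), len(packed), restored == text)
```

For files on disk, `gzip.open()` offers a file-like interface instead of whole-buffer calls.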

For example, it may be tempting to use the tuple packing and unpacking feature
instead of the traditional approach to swapping arguments. The timeit
module quickly demonstrates a modest performance advantage:

    >>> from timeit import Timer
    >>> Timer('t=a; a=b; b=t', 'a=1; b=2').timeit()
    0.57535828626024577
    >>> Timer('a,b = b,a', 'a=1; b=2').timeit()
    0.54962537085770791

In contrast to timeit's fine level of granularity, the profile and
pstats modules provide tools for identifying time critical sections in
larger blocks of code.
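A rough sketch of that workflow follows. It uses `cProfile`, the C implementation that exposes the same interface as `profile`, together with `pstats`; the `slow_sum` function is a made-up workload, and the timings it reports will differ from machine to machine.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # Deliberately simple, hypothetical workload so there is something to measure.
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
result = profiler.runcall(slow_sum, 100_000)   # profile a single call

# Format the collected statistics into a string, slowest cumulative time first.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats('cumulative').print_stats(5)
print(report.getvalue())
```

Sorting by cumulative time tends to surface the functions whose whole call tree is expensive, which is usually the first question when looking at a larger block of code.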

## 10.11. Quality Control

One approach for developing high quality software is to write tests for each
function as it is developed and to run those tests frequently during the
development process.

The doctest module provides a tool for scanning a module and validating
tests embedded in a program's docstrings. Test construction is as simple as
cutting-and-pasting a typical call along with its results into the docstring.
This improves the documentation by providing the user with an example and it
allows the doctest module to make sure the code remains true to the
documentation:

    def average(values):
        """Computes the arithmetic mean of a list of numbers.

        >>> print(average([20, 30, 70]))
        40.0
        """
        return sum(values) / len(values)

    import doctest
    doctest.testmod()   # automatically validate the embedded tests

The unittest module is not as effortless as the doctest module,
but it allows a more comprehensive set of tests to be maintained in a separate
file:

    import unittest

    class TestStatisticalFunctions(unittest.TestCase):

        def test_average(self):
            self.assertEqual(average([20, 30, 70]), 40.0)
            self.assertEqual(round(average([1, 5, 7]), 1), 4.3)
            with self.assertRaises(ZeroDivisionError):
                average([])
            with self.assertRaises(TypeError):
                average(20, 30, 70)

    unittest.main()  # Calling from the command line invokes all tests

## 10.12. Batteries Included

Python has a "batteries included" philosophy. This is best seen through the
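When the tests need to run from within a program or an interactive session rather than from the command line, one option is to build a suite and hand it to a runner. This is a sketch under that assumption; the `TestAverage` class is a shortened, illustrative variant of the test case above.

```python
import unittest

def average(values):
    """Compute the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

class TestAverage(unittest.TestCase):
    def test_average(self):
        self.assertEqual(average([20, 30, 70]), 40.0)
        with self.assertRaises(ZeroDivisionError):
            average([])

# Collect the test methods of one class and run them without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAverage)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

The returned result object records how many tests ran and which ones failed, which makes it usable from other tooling.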
sophisticated and robust capabilities of its larger packages. For example:

# 11. Brief Tour of the Standard Library - Part II

This second tour covers more advanced modules that support professional
programming needs. These modules rarely occur in small scripts.

## 11.1. Output Formatting

The reprlib module provides a version of repr() customized for
abbreviated displays of large or deeply nested containers:

    >>> import reprlib
    >>> reprlib.repr(set('supercalifragilisticexpialidocious'))
    "{'a', 'c', 'd', 'e', 'f', 'g', ...}"

The pprint module offers more sophisticated control over printing both
built-in and user defined objects in a way that is readable by the interpreter.
When the result is longer than one line, the "pretty printer" adds line breaks
and indentation to more clearly reveal data structure:

    >>> import pprint
    >>> t = [[[['black', 'cyan'], 'white', ['green', 'red']], [['magenta',
    ...     'yellow'], 'blue']]]
    ...
    >>> pprint.pprint(t, width=30)
    [[[['black', 'cyan'],
       'white',
       ['green', 'red']],
      [['magenta', 'yellow'],
       'blue']]]

The textwrap module formats paragraphs of text to fit a given screen
width:

    >>> import textwrap
    >>> doc = """The wrap() method is just like fill() except that it returns
    ... a list of strings instead of one big string with newlines to separate
    ... the wrapped lines."""
    ...
    >>> print(textwrap.fill(doc, width=40))
    The wrap() method is just like fill()
    except that it returns a list of strings
    instead of one big string with newlines
    to separate the wrapped lines.

The locale module accesses a database of culture specific data formats.
The grouping attribute of locale's format_string() function provides a direct way of
formatting numbers with group separators:

    >>> import locale
    >>> locale.setlocale(locale.LC_ALL, 'English_United States.1252')
    'English_United States.1252'
    >>> conv = locale.localeconv()          # get a mapping of conventions
    >>> x = 1234567.8
    >>> locale.format_string("%d", x, grouping=True)
    '1,234,567'
    >>> locale.format_string("%s%.*f", (conv['currency_symbol'],
    ...                      conv['frac_digits'], x), grouping=True)
    '$1,234,567.80'

## 11.2. Templating

The string module includes a versatile Template class with a simplified syntax
suitable for editing by end-users. This allows users to customize their
applications without having to alter the application.

The format uses placeholder names formed by $ with valid Python identifiers
(alphanumeric characters and underscores). Surrounding the placeholder with
braces allows it to be followed by more alphanumeric letters with no intervening
spaces. Writing $$ creates a single escaped $:

    >>> from string import Template
    >>> t = Template('${village}folk send $$10 to $cause.')
    >>> t.substitute(village='Nottingham', cause='the ditch fund')
    'Nottinghamfolk send $10 to the ditch fund.'

The substitute() method raises a KeyError when a
placeholder is not supplied in a dictionary or a keyword argument. For
mail-merge style applications, user supplied data may be incomplete and the
safe_substitute() method may be more appropriate:
it will leave placeholders unchanged if data is missing:

    >>> t = Template('Return the $item to $owner.')
    >>> d = dict(item='unladen swallow')
    >>> t.substitute(d)
    Traceback (most recent call last):
      ...
    KeyError: 'owner'
    >>> t.safe_substitute(d)
    'Return the unladen swallow to $owner.'

Template subclasses can specify a custom delimiter. For example, a batch
renaming utility for a photo browser may elect to use percent signs for
placeholders such as the current date, image sequence number, or file format:

    >>> import time, os.path
    >>> photofiles = ['img_1074.jpg', 'img_1076.jpg', 'img_1077.jpg']
    >>> class BatchRename(Template):
    ...     delimiter = '%'
    ...
    >>> fmt = input('Enter rename style (%d-date %n-seqnum %f-format): ')
    Enter rename style (%d-date %n-seqnum %f-format): Ashley_%n%f

    >>> t = BatchRename(fmt)
    >>> date = time.strftime('%d%b%y')
    >>> for i, filename in enumerate(photofiles):
    ...     base, ext = os.path.splitext(filename)
    ...     newname = t.substitute(d=date, n=i, f=ext)
    ...     print('{0} --> {1}'.format(filename, newname))
    ...
    img_1074.jpg --> Ashley_0.jpg
    img_1076.jpg --> Ashley_1.jpg
    img_1077.jpg --> Ashley_2.jpg

Another application for templating is separating program logic from the details
of multiple output formats. This makes it possible to substitute custom
templates for XML files, plain text reports, and HTML web reports.

## 11.3. Working with Binary Data Record Layouts

The struct module provides pack() and unpack() functions for working with
variable length binary record formats. The following example shows how to loop
through header information in a ZIP file without using the zipfile module.
Pack codes "H" and "I" represent two and four byte unsigned numbers respectively.
The "<" indicates that they are standard size and in little-endian byte order:

    import struct

    with open('myfile.zip', 'rb') as f:
        data = f.read()

    start = 0
    for i in range(3):                      # show the first 3 file headers
        start += 14
        fields = struct.unpack('<IIIHH', data[start:start+16])
        crc32, comp_size, uncomp_size, filenamesize, extra_size = fields

        start += 16
        filename = data[start:start+filenamesize]
        start += filenamesize
        extra = data[start:start+extra_size]
        print(filename, hex(crc32), comp_size, uncomp_size)

        start += extra_size + comp_size     # skip to the next header

## 11.4. Multi-threading

Threading is a technique for decoupling tasks which are not sequentially
dependent. Threads can be used to improve the responsiveness of applications
that accept user input while other tasks run in the background. A related use
case is running I/O in parallel with computations in another thread.

The following code shows how the high level threading module can run tasks in
background while the main program continues to run:

    import threading, zipfile

    class AsyncZip(threading.Thread):
        def __init__(self, infile, outfile):
            threading.Thread.__init__(self)
            self.infile = infile
            self.outfile = outfile

        def run(self):
            f = zipfile.ZipFile(self.outfile, 'w', zipfile.ZIP_DEFLATED)
            f.write(self.infile)
            f.close()
            print('Finished background zip of:', self.infile)

    background = AsyncZip('mydata.txt', 'myarchive.zip')
    background.start()
    print('The main program continues to run in foreground.')

    background.join()    # Wait for the background task to finish
    print('Main program waited until background was done.')

The principal challenge of multi-threaded applications is coordinating threads
that share data or other resources. To that end, the threading module provides
a number of synchronization primitives including locks, events, condition
variables, and semaphores.

While those tools are powerful, minor design errors can result in problems that
are difficult to reproduce.
So, the preferred approach to task coordination is to
concentrate all access to a resource in a single thread and then use the
queue module to feed that thread with requests from other threads.
Applications using Queue objects for inter-thread communication and
coordination are easier to design, more readable, and more reliable.

## 11.5. Logging

The logging module offers a full featured and flexible logging system. At
its simplest, log messages are sent to a file or to sys.stderr:

    import logging
    logging.debug('Debugging information')
    logging.info('Informational message')
    logging.warning('Warning:config file %s not found', 'server.conf')
    logging.error('Error occurred')
    logging.critical('Critical error -- shutting down')

This produces the following output:

    WARNING:root:Warning:config file server.conf not found
    ERROR:root:Error occurred
    CRITICAL:root:Critical error -- shutting down

By default, informational and debugging messages are suppressed and the output
is sent to standard error. Other output options include routing messages
through email, datagrams, sockets, or to an HTTP Server. New filters can select
different routing based on message priority: DEBUG, INFO, WARNING, ERROR,
and CRITICAL.

The logging system can be configured directly from Python or can be loaded from
a user editable configuration file for customized logging without altering the
application.

## 11.6. Weak References

Python does automatic memory management (reference counting for most objects and
garbage collection to eliminate cycles). The memory is freed shortly after the
last reference to it has been eliminated.

This approach works fine for most applications but occasionally there is a need
to track objects only as long as they are being used by something else.
Unfortunately, just tracking them creates a reference that makes them permanent.
The weakref module provides tools for tracking objects without creating a
reference.
When the object is no longer needed, it is automatically removed
from a weakref table and a callback is triggered for weakref objects. Typical
applications include caching objects that are expensive to create:

    >>> import weakref, gc
    >>> class A:
    ...     def __init__(self, value):
    ...         self.value = value
    ...     def __repr__(self):
    ...         return str(self.value)
    ...
    >>> a = A(10)                   # create a reference
    >>> d = weakref.WeakValueDictionary()
    >>> d['primary'] = a            # does not create a reference
    >>> d['primary']                # fetch the object if it is still alive
    10
    >>> del a                       # remove the one reference
    >>> gc.collect()                # run garbage collection right away
    0
    >>> d['primary']                # entry was automatically removed
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
        d['primary']                # entry was automatically removed
      File "C:/python311/lib/weakref.py", line 46, in __getitem__
        o = self.data[key]()
    KeyError: 'primary'

## 11.7. Tools for Working with Lists

Many data structure needs can be met with the built-in list type. However,
sometimes there is a need for alternative implementations with different
performance trade-offs.

The array module provides an array() object that is like a list that stores
only homogeneous data and stores it more compactly. The following example shows
an array of numbers stored as two byte unsigned binary numbers (typecode "H")
rather than the usual 16 bytes per entry for regular lists of Python int objects:

    >>> from array import array
    >>> a = array('H', [4000, 10, 700, 22222])
    >>> sum(a)
    26932
    >>> a[1:3]
    array('H', [10, 700])

The collections module provides a deque() object that is like a list with
faster appends and pops from the left side but slower lookups in the middle.
These objects are well suited for implementing queues and breadth first tree
searches:

    >>> from collections import deque
    >>> d = deque(["task1", "task2", "task3"])
    >>> d.append("task4")
    >>> print("Handling", d.popleft())
    Handling task1

    unsearched = deque([starting_node])
    def breadth_first_search(unsearched):
        node = unsearched.popleft()
        for m in gen_moves(node):
            if is_goal(m):
                return m
            unsearched.append(m)

In addition to alternative list implementations, the library also offers other
tools such as the bisect module with functions for manipulating sorted lists:

    >>> import bisect
    >>> scores = [(100, 'perl'), (200, 'tcl'), (400, 'lua'), (500, 'python')]
    >>> bisect.insort(scores, (300, 'ruby'))
    >>> scores
    [(100, 'perl'), (200, 'tcl'), (300, 'ruby'), (400, 'lua'), (500, 'python')]

The heapq module provides functions for implementing heaps based on regular
lists. The lowest valued entry is always kept at position zero. This is useful
for applications which repeatedly access the smallest element but do not want to
run a full list sort:

    >>> from heapq import heapify, heappop, heappush
    >>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
    >>> heapify(data)                      # rearrange the list into heap order
    >>> heappush(data, -5)                 # add a new entry
    >>> [heappop(data) for i in range(3)]  # fetch the three smallest entries
    [-5, 0, 1]

## 11.8. Decimal Floating Point Arithmetic

The decimal module offers a Decimal datatype for decimal floating point
arithmetic. Compared to the built-in float implementation of binary floating
point, the class is especially helpful for

• financial applications and other uses which require exact decimal
representation,

• control over precision,

• control over rounding to meet legal or regulatory requirements,

• tracking of significant decimal places, or

• applications where the user expects the results to match calculations done by
hand.

For example, calculating a 5% tax on a 70 cent phone charge gives different
results in decimal floating point and binary floating point.
The difference becomes significant if the results are rounded to the nearest
cent:

    >>> from decimal import *
    >>> round(Decimal('0.70') * Decimal('1.05'), 2)
    Decimal('0.74')
    >>> round(.70 * 1.05, 2)
    0.73

The Decimal result keeps a trailing zero, automatically inferring four place
significance from multiplicands with two place significance. Decimal reproduces
mathematics as done by hand and avoids issues that can arise when binary
floating point cannot exactly represent decimal quantities.

Exact representation enables the Decimal class to perform modulo calculations
and equality tests that are unsuitable for binary floating point:

    >>> Decimal('1.00') % Decimal('.10')
    Decimal('0.00')
    >>> 1.00 % 0.10
    0.09999999999999995

    >>> sum([Decimal('0.1')]*10) == Decimal('1.0')
    True
    >>> sum([0.1]*10) == 1.0
    False

The decimal module provides arithmetic with as much precision as needed:

    >>> getcontext().prec = 36
    >>> Decimal(1) / Decimal(7)
    Decimal('0.142857142857142857142857142857142857')

# 12. Virtual Environments and Packages

## 12.1. Introduction

Python applications will often use packages and modules that don't come as part
of the standard library. Applications will sometimes need a specific version of
a library, because the application may require that a particular bug has been
fixed or the application may be written using an obsolete version of the
library's interface.

This means it may not be possible for one Python installation to meet the
requirements of every application. If application A needs version 1.0 of a
particular module but application B needs version 2.0, then the requirements
are in conflict and installing either version 1.0 or 2.0 will leave one
application unable to run.

The solution for this problem is to create a virtual environment, a
self-contained directory tree that contains a Python installation for a
particular version of Python, plus a number of additional packages.
Different applications can then use different virtual environments. To resolve
the earlier example of conflicting requirements, application A can have its own
virtual environment with version 1.0 installed while application B has another
virtual environment with version 2.0. If application B requires a library be
upgraded to version 3.0, this will not affect application A's environment.

## 12.2. Creating Virtual Environments

The module used to create and manage virtual environments is called venv.
venv will usually install the most recent version of Python that you have
available. If you have multiple versions of Python on your system, you can
select a specific Python version by running python3 or whichever version you
want.

To create a virtual environment, decide upon a directory where you want to
place it, and run the venv module as a script with the directory path:

    python3 -m venv tutorial-env

This will create the tutorial-env directory if it doesn't exist, and also
create directories inside it containing a copy of the Python interpreter and
various supporting files.

A common directory location for a virtual environment is .venv. This name
keeps the directory typically hidden in your shell and thus out of the way
while giving it a name that explains why the directory exists. It also prevents
clashing with .env environment variable definition files that some tooling
supports.

Once you've created a virtual environment, you may activate it.

On Windows, run:

    tutorial-env\Scripts\activate.bat

On Unix or MacOS, run:

    source tutorial-env/bin/activate

(This script is written for the bash shell. If you use the csh or fish shells,
there are alternate activate.csh and activate.fish scripts you should use
instead.)

Activating the virtual environment will change your shell's prompt to show what
virtual environment you're using, and modify the environment so that running
python will get you that particular version and installation of Python.
For example:

    $ source ~/envs/tutorial-env/bin/activate
    (tutorial-env) $ python
    Python 3.5.1 (default, May  6 2016, 10:59:36)
      ...
    >>> import sys
    >>> sys.path
    ['', '/usr/local/lib/python35.zip', ...,
    '~/envs/tutorial-env/lib/python3.5/site-packages']
    >>>

To deactivate a virtual environment, type:

    deactivate

into the terminal.

## 12.3. Managing Packages with pip

You can install, upgrade, and remove packages using a program called pip. By
default pip will install packages from the Python Package Index,
<https://pypi.org>. You can browse the Python Package Index by going to it in
your web browser.

pip has a number of subcommands: "install", "uninstall", "freeze", etc.
(Consult the Installing Python Modules guide for complete documentation for
pip.)

You can install the latest version of a package by specifying a package's name:

    (tutorial-env) $ python -m pip install novas
    Collecting novas
      Downloading novas-3.1.1.3.tar.gz (136kB)
    Installing collected packages: novas
      Running setup.py install for novas
    Successfully installed novas-3.1.1.3

You can also install a specific version of a package by giving the
package name followed by == and the version number:

    (tutorial-env) $ python -m pip install requests==2.6.0
    Collecting requests==2.6.0
      Using cached requests-2.6.0-py2.py3-none-any.whl
    Installing collected packages: requests
    Successfully installed requests-2.6.0

If you re-run this command, pip will notice that the requested version is
already installed and do nothing. You can supply a different version number to
get that version, or you can run python -m pip install --upgrade to upgrade the
package to the latest version:

    (tutorial-env) $ python -m pip install --upgrade requests
    Collecting requests
    Installing collected packages: requests
      Found existing installation: requests 2.6.0
        Uninstalling requests-2.6.0:
          Successfully uninstalled requests-2.6.0
    Successfully installed requests-2.7.0

python -m pip uninstall followed by one or more package names will
remove the packages from the virtual environment.

python -m pip show will display information about a particular package:

    (tutorial-env) $ python -m pip show requests
    ---
    Metadata-Version: 2.0
    Name: requests
    Version: 2.7.0
    Summary: Python HTTP for Humans.
    Home-page: http://python-requests.org
    Author: Kenneth Reitz
    Author-email: me@kennethreitz.com
    License: Apache 2.0
    Location: /Users/akuchling/envs/tutorial-env/lib/python3.4/site-packages
    Requires:

python -m pip list will display all of the packages installed in the virtual
environment:

    (tutorial-env) $ python -m pip list
    novas (3.1.1.3)
    numpy (1.9.2)
    pip (7.0.3)
    requests (2.7.0)
    setuptools (16.0)

python -m pip freeze will produce a similar list of the installed packages,
but the output uses the format that python -m pip install expects.
A common convention is to put this list in a requirements.txt file:

    (tutorial-env) $ python -m pip freeze > requirements.txt
    (tutorial-env) $ cat requirements.txt
    novas==3.1.1.3
    numpy==1.9.2
    requests==2.7.0

The requirements.txt can then be committed to version control and
shipped as part of an application. Users can then install all the
necessary packages with install -r:

    (tutorial-env) $ python -m pip install -r requirements.txt
    Collecting novas==3.1.1.3 (from -r requirements.txt (line 1))
      ...
    Collecting numpy==1.9.2 (from -r requirements.txt (line 2))
      ...
    Collecting requests==2.7.0 (from -r requirements.txt (line 3))
      ...
    Installing collected packages: novas, numpy, requests
      Running setup.py install for novas
    Successfully installed novas-3.1.1.3 numpy-1.9.2 requests-2.7.0

pip has many more options. Consult the Installing Python Modules
guide for complete documentation for pip. When you've written
a package and want to make it available on the Python Package Index,
consult the Distributing Python Modules guide.

# 13. What Now?

Reading this tutorial has probably reinforced your interest in using Python;
you should be eager to apply Python to solving your real-world problems. Where
should you go to learn more?

This tutorial is part of Python's documentation set. Some other documents in
the set are:

• The Python Standard Library: You should browse through this manual, which
gives complete (though terse) reference material about types, functions, and
the modules in the standard library. The standard Python distribution includes
a lot of additional code. There are modules to read Unix mailboxes, retrieve
documents via HTTP, generate random numbers, parse command-line options,
compress data, and many other tasks. Skimming through the Library Reference
will give you an idea of what's available.

• Installing Python Modules explains how to install additional modules written
by other Python users.

• The Python Language Reference: A detailed explanation of Python's syntax and
semantics. It's heavy reading, but is useful as a complete guide to the
language itself.

More Python resources:

• https://www.python.org: The major Python web site. It contains code,
documentation, and pointers to Python-related pages around the web.

• https://pypi.org: The Python Package Index, previously also nicknamed
the Cheese Shop [1], is an index of user-created Python modules that are available
for download. Once you begin releasing code, you can register it here so that
others can find it.

• https://code.activestate.com/recipes/langs/python/: The Python Cookbook is a
sizable collection of code examples, larger modules, and useful scripts.
Particularly notable contributions are collected in a book also titled Python
Cookbook (O'Reilly & Associates, ISBN 0-596-00797-3).

• https://pyvideo.org collects links to Python-related videos from
conferences and user-group meetings.

• https://scipy.org: The Scientific Python project includes modules for fast
array computations and manipulations plus a host of packages for such
things as linear algebra, Fourier transforms, non-linear solvers,
random number distributions, statistical analysis and the like.

For Python-related questions and problem reports, you can post to the newsgroup
comp.lang.python, or send them to the mailing list at
python-list@python.org. The newsgroup and mailing list are gatewayed, so
messages posted to one will automatically be forwarded to the other. There are
hundreds of postings a day, asking (and
answering) questions, suggesting new features, and announcing new modules.
Mailing list archives are available at https://mail.python.org/pipermail/.

Before posting, be sure to check the list of
Frequently Asked Questions (also called the FAQ). The
FAQ answers many of the questions that come up again and again, and may
already contain the solution for your problem.
Footnotes

[1] "Cheese Shop" is a Monty Python's sketch: a customer enters a cheese shop,
but whatever cheese he asks for, the clerk says it's missing.

# 14. Interactive Input Editing and History Substitution

Some versions of the Python interpreter support editing of the current input
line and history substitution, similar to facilities found in the Korn shell and
the GNU Bash shell. This is implemented using the GNU Readline library,
which supports various styles of editing. This library has its own
documentation which we won’t duplicate here.

## 14.1. Tab Completion and History Editing

Completion of variable and module names is
automatically enabled at interpreter startup so
that the Tab key invokes the completion function; it looks at
Python statement names, the current local variables, and the available
module names. For dotted expressions such as string.a, it will evaluate
the expression up to the final '.' and then suggest completions from
the attributes of the resulting object. Note that this may execute
application-defined code if an object with a __getattr__() method
is part of the expression. The default configuration also saves your
history into a file named .python_history in your user directory.
The history will be available again during the next interactive interpreter
session.
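The dotted-name behaviour described above can be observed outside an interactive session. The following is a minimal sketch that drives the standard rlcompleter module directly. The helper name all_completions is hypothetical, and the sketch does not claim to reproduce exactly how the interpreter wires completion into Readline at startup.

```python
import rlcompleter
import string


def all_completions(text, namespace):
    """Collect every completion offered for `text` (hypothetical helper)."""
    completer = rlcompleter.Completer(namespace)
    matches = []
    state = 0
    while True:
        # complete() returns the state-th match, or None when there are no more.
        match = completer.complete(text, state)
        if match is None:
            break
        matches.append(match)
        state += 1
    return matches


# "string.a" is evaluated up to the final "." and the attributes of the
# resulting module object that start with "a" are suggested.
suggestions = all_completions("string.a", {"string": string})
print(suggestions)
```

Because the expression before the final dot is evaluated, passing a namespace that contains objects with a custom __getattr__() would run that code, which is the caveat noted above.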

## 14.2. Alternatives to the Interactive Interpreter

This facility is an enormous step forward compared to earlier versions of the
interpreter; however, some wishes are left: It would be nice if the proper
indentation were suggested on continuation lines (the parser knows if an indent
token is required next). The completion mechanism might use the interpreter’s
symbol table. A command to check (or even suggest) matching parentheses,
quotes, etc., would also be useful.

One alternative enhanced interactive interpreter that has been around for quite
some time is IPython, which features tab completion, object exploration and
advanced history management. It can also be thoroughly customized and embedded
into other applications. Another similar enhanced interactive environment is
bpython.

# 15. Floating Point Arithmetic: Issues and Limitations

Floating-point numbers are represented in computer hardware as base 2 (binary)
fractions. For example, the decimal fraction 0.125
has value 1/10 + 2/100 + 5/1000, and in the same way the binary fraction 0.001
has value 0/2 + 0/4 + 1/8. These two fractions have identical values, the only
real difference being that the first is written in base 10 fractional notation,
and the second in base 2.
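As a small check of the two expansions above, the sketch below adds up the positional digits exactly with the fractions module. The helper name positional_value is hypothetical and only handles digits written after the point.

```python
from fractions import Fraction


def positional_value(digits, base):
    """Exact value of 0.<digits> written in the given base (hypothetical helper)."""
    return sum(
        Fraction(int(digit, base), base ** position)
        for position, digit in enumerate(digits, start=1)
    )


decimal_value = positional_value("125", 10)  # 1/10 + 2/100 + 5/1000
binary_value = positional_value("001", 2)    # 0/2 + 0/4 + 1/8
print(decimal_value, binary_value)           # prints: 1/8 1/8
```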

Unfortunately, most decimal fractions cannot be represented exactly as binary
fractions. A consequence is that, in general, the decimal floating-point
numbers you enter are only approximated by the binary floating-point numbers
actually stored in the machine.

The problem is easier to understand at first in base 10. Consider the fraction
1/3. You can approximate that as a base 10 fraction:

0.3

or, better,

0.33

or, better,

0.333

and so on. No matter how many digits you’re willing to write down, the result
will never be exactly 1/3, but will be an increasingly better approximation of
1/3.
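The shrinking but never vanishing error can be made concrete with exact rational arithmetic. This is an illustrative sketch only; the variable names are arbitrary.

```python
from fractions import Fraction

third = Fraction(1, 3)
errors = []
for digits in (1, 2, 3):
    # Build 0.3, 0.33 and 0.333 as exact fractions.
    approx = Fraction(int("3" * digits), 10 ** digits)
    errors.append(third - approx)
    print(approx, "misses 1/3 by", third - approx)
```

Each extra digit divides the error by ten (1/30, 1/300, 1/3000), but no finite number of digits brings it to zero.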

In the same way, no matter how many base 2 digits you’re willing to use, the
decimal value 0.1 cannot be represented exactly as a base 2 fraction. In base
2, 1/10 is the infinitely repeating fraction

0.0001100110011001100110011001100110011001100110011...

Stop at any finite number of bits, and you get an approximation. On most
machines today, floats are approximated using a binary fraction with
the numerator using the first 53 bits starting with the most significant bit and
with the denominator as a power of two. In the case of 1/10, the binary fraction
is 3602879701896397 / 2 ** 55 which is close to but not exactly
equal to the true value of 1/10.
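The repeating expansion can be reproduced with schoolbook long division in base 2. The sketch below assumes a value between 0 and 1, and the helper name binary_fraction_digits is hypothetical. It also confirms that the fraction quoted above is the value stored for 0.1 on a machine using 53-bit doubles.

```python
def binary_fraction_digits(numerator, denominator, count):
    """First `count` base-2 digits after the point of numerator/denominator.

    Hypothetical helper; assumes 0 <= numerator/denominator < 1.
    """
    digits = []
    remainder = numerator
    for _ in range(count):
        remainder *= 2                               # shift one binary place
        digits.append(str(remainder // denominator)) # next digit is 0 or 1
        remainder %= denominator
    return "".join(digits)


bits = binary_fraction_digits(1, 10, 24)
print("0." + bits + "...")

# The 53-bit numerator over a power of two is the float stored for 0.1.
stored = 3602879701896397 / 2 ** 55
print(stored == 0.1)
```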

Many users are not aware of the approximation because of the way values are
displayed. Python only prints a decimal approximation to the true decimal
value of the binary approximation stored by the machine. On most machines, if
Python were to print the true decimal value of the binary approximation stored
for 0.1, it would have to display:

>>> 0.1
0.1000000000000000055511151231257827021181583404541015625

That is more digits than most people find useful, so Python keeps the number
of digits manageable by displaying a rounded value instead:

>>> 1 / 10
0.1

Just remember, even though the printed result looks like the exact value
of 1/10, the actual stored value is the nearest representable binary fraction.

Interestingly, there are many different decimal numbers that share the same
nearest approximate binary fraction. For example, the numbers 0.1 and
0.10000000000000001 and
0.1000000000000000055511151231257827021181583404541015625 are all
approximated by 3602879701896397 / 2 ** 55. Since all of these decimal
values share the same approximation, any one of them could be displayed
while still preserving the invariant eval(repr(x)) == x.

Historically, the Python prompt and built-in repr() function would choose
the one with 17 significant digits, 0.10000000000000001. Starting with
Python 3.1, Python (on most systems) is now able to choose the shortest of
these and simply display 0.1.
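A short check of this claim, assuming Python 3.1 or later on a platform with IEEE 754 doubles: the three literals below are read as the same float, and repr() picks the shortest string that round-trips.

```python
a = 0.1
b = 0.10000000000000001
c = 0.1000000000000000055511151231257827021181583404541015625

# All three decimal literals map to the same nearest binary fraction.
print(a == b == c)

# repr() displays the shortest of the candidates, and it still round-trips.
shortest = repr(c)
print(shortest)
print(eval(shortest) == c)
```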

Note that this is in the very nature of binary floating-point: this is not a bug
in Python, and it is not a bug in your code either. You’ll see the same kind of
thing in all languages that support your hardware’s floating-point arithmetic
(although some languages may not display the difference by default, or in all
output modes).

For more pleasant output, you may wish to use string formatting to produce a limited number of significant digits:

>>> import math
>>> format(math.pi, '.12g')  # give 12 significant digits
'3.14159265359'
>>> format(math.pi, '.2f')   # give 2 digits after the point
'3.14'
>>> repr(math.pi)
'3.141592653589793'

It’s important to realize that this is, in a real sense, an illusion: you’re
simply rounding the display of the true machine value.

One illusion may beget another. For example, since 0.1 is not exactly 1/10,
summing three values of 0.1 may not yield exactly 0.3, either:

>>> .1 + .1 + .1 == .3
False

Also, since 0.1 cannot get any closer to the exact value of 1/10 and
0.3 cannot get any closer to the exact value of 3/10, pre-rounding with
the round() function cannot help:

>>> round(.1, 1) + round(.1, 1) + round(.1, 1) == round(.3, 1)
False

Though the numbers cannot be made closer to their intended exact values,
the round() function can be useful for post-rounding so that results
with inexact values become comparable to one another:

>>> round(.1 + .1 + .1, 10) == round(.3, 10)
True

Binary floating-point arithmetic holds many surprises like this. The problem
with “0.1” is explained in precise detail below, in the “Representation Error”
section. See The Perils of Floating Point
for a more complete account of other common surprises.

As that says near the end, “there are no easy answers.” Still, don’t be unduly
wary of floating-point! The errors in Python float operations are inherited
from the floating-point hardware, and on most machines are on the order of no
more than 1 part in 2**53 per operation. That’s more than adequate for most
tasks, but you do need to keep in mind that it’s not decimal arithmetic and
that every float operation can suffer a new rounding error.
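To put a number on that bound, the sketch below measures the relative error introduced by a single rounding step, the conversion of the literal 0.1. It assumes IEEE 754 doubles, which is the common case but not guaranteed on every platform.

```python
import sys
from fractions import Fraction

# Number of bits of precision in a float on this machine (53 for IEEE 754 doubles).
print(sys.float_info.mant_dig)

exact = Fraction(1, 10)   # the value that was intended
stored = Fraction(0.1)    # the exact value the float actually holds
relative_error = abs(stored - exact) / exact

# One rounding step stays within 1 part in 2**53.
print(relative_error <= Fraction(1, 2 ** 53))
```

The error per step is tiny, but a long chain of operations rounds again at every step, which is why the errors can add up.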

While pathological cases do exist, for most casual use of floating-point
arithmetic you’ll see the result you expect in the end if you simply round the
display of your final results to the number of decimal digits you expect.
str() usually suffices, and for finer control see the str.format()
method’s format specifiers in Format String Syntax.
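As an illustration of rounding only the display, the sketch below keeps the full float for computation and formats it when it is shown. The two-decimal format is an arbitrary choice for the example.

```python
total = 0.1 + 0.1 + 0.1

# The stored value carries the accumulated rounding error...
print(repr(total))

# ...but rounding the final display to the digits you expect hides it.
shown = "{:.2f}".format(total)
print(shown)
```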

For use cases which require exact decimal representation, try using the
decimal module which implements decimal arithmetic suitable for
accounting applications and high-precision applications.
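A minimal sketch of the difference, using a hypothetical price of 0.10. Note that the Decimal values are built from strings, so the decimal digits are taken as written rather than passing through a binary float first.

```python
from decimal import Decimal

price = Decimal("0.10")   # hypothetical amount, constructed from a string
total = price * 3
print(total)                       # the trailing zero is preserved
print(total == Decimal("0.30"))

# The same calculation with binary floats does not compare equal.
print(0.1 * 3 == 0.3)
```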

Another form of exact arithmetic is supported by the fractions module
which implements arithmetic based on rational numbers (so numbers like
1/3 can be represented exactly).
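For instance, the following sketch shows sums that are inexact in binary floating-point coming out exact with Fraction:

```python
from fractions import Fraction

third = Fraction(1, 3)
thirds_sum = third + third + third
print(thirds_sum == 1)

tenth = Fraction(1, 10)
tenths_sum = tenth + tenth + tenth
print(tenths_sum == Fraction(3, 10))
```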

If you are a heavy user of floating-point operations you should take a look
at the NumPy package and many other packages for mathematical and
statistical operations supplied by the SciPy project. See <https://scipy.org>.

Python provides tools that may help on those rare occasions when you really
do want to know the exact value of a float. The
float.as_integer_ratio() method expresses the value of a float as a
fraction:

>>> x = 3.14159
>>> x.as_integer_ratio()
(3537115888337719, 1125899906842624)

Since the ratio is exact, it can be used to losslessly recreate the
original value:

>>> x == 3537115888337719 / 1125899906842624
True

The float.hex() method expresses a float in hexadecimal (base
16), again giving the exact value stored by your computer:

>>> x.hex()
'0x1.921f9f01b866ep+1'

This precise hexadecimal representation can be used to reconstruct
the float value exactly:

>>> x == float.fromhex('0x1.921f9f01b866ep+1')
True

Since the representation is exact, it is useful for reliably porting values
across different versions of Python (platform independence) and exchanging
data with other languages that support the same format (such as Java and C99).

Another helpful tool is the math.fsum() function which helps mitigate
loss-of-precision during summation. It tracks “lost digits” as values are
added onto a running total. That can make a difference in overall accuracy
so that the errors do not accumulate to the point where they affect the
final total:

>>> import math
>>> sum([0.1] * 10) == 1.0
False
>>> math.fsum([0.1] * 10) == 1.0
True

## 15.1. Representation Error

This section explains the “0.1” example in detail, and shows how you can perform
an exact analysis of cases like this yourself. Basic familiarity with binary
floating-point representation is assumed.

Representation error refers to the fact that some (most, actually)
decimal fractions cannot be represented exactly as binary (base 2) fractions.
This is the chief reason why Python (or Perl, C, C++, Java, Fortran, and many
others) often won’t display the exact decimal number you expect.

Why is that? 1/10 is not exactly representable as a binary fraction. Almost all
machines today (November 2000) use IEEE-754 floating point arithmetic, and
almost all platforms map Python floats to IEEE-754 “double precision”. 754
doubles contain 53 bits of precision, so on input the computer strives to
convert 0.1 to the closest fraction it can of the form J/2**N where J is
an integer containing exactly 53 bits. Rewriting

1 / 10 ~= J / (2**N)

as

J ~= 2**N / 10

and recalling that J has exactly 53 bits (is >= 2**52 but < 2**53),
the best value for N is 56:

>>> 2**52 <= 2**56 // 10 < 2**53
True

That is, 56 is the only value for N that leaves J with exactly 53 bits. The
best possible value for J is then that quotient rounded:

>>> q, r = divmod(2**56, 10)
>>> r
6

Since the remainder is more than half of 10, the best approximation is obtained
by rounding up:

>>> q+1
7205759403792794

Therefore the best possible approximation to 1/10 in 754 double precision is:

7205759403792794 / 2 ** 56

Dividing both the numerator and denominator by two reduces the fraction to:

3602879701896397 / 2 ** 55

Note that since we rounded up, this is actually a little bit larger than 1/10;
if we had not rounded up, the quotient would have been a little bit smaller than
1/10. But in no case can it be exactly 1/10!
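The steps above (pick N so that J has 53 bits, divide, then round) can be gathered into one function. This is a sketch under stated assumptions rather than how any interpreter actually parses floats: the helper name nearest_double_fraction is hypothetical, the input must satisfy 0 < value < 2**52, and subnormals, overflow, infinities and NaN are ignored. Exact ties are rounded to the even J, following the IEEE 754 default rounding mode.

```python
from fractions import Fraction


def nearest_double_fraction(value):
    """Closest fraction J / 2**N to `value` with a 53-bit J (hypothetical helper).

    Assumes 0 < value < 2**52; special cases of IEEE 754 are not handled.
    """
    value = Fraction(value)
    if not 0 < value < 2 ** 52:
        raise ValueError("value must satisfy 0 < value < 2**52")
    p, q = value.numerator, value.denominator

    # Find the N that leaves the quotient with exactly 53 bits.
    n = 0
    while (p << n) // q < 2 ** 52:
        n += 1

    j, remainder = divmod(p << n, q)
    # Round to nearest; an exact tie goes to the even J.
    if 2 * remainder > q or (2 * remainder == q and j % 2 == 1):
        j += 1
    return Fraction(j, 2 ** n)


approx = nearest_double_fraction(Fraction(1, 10))
print(approx == Fraction(3602879701896397, 2 ** 55))
print(approx == Fraction(0.1))   # matches the float the machine stores
```

The same function can be applied to other fractions, such as Fraction(1, 3), and compared against Fraction(1 / 3) to repeat the analysis for a different value.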

So the computer never “sees” 1/10: what it sees is the exact fraction given
above, the best 754 double approximation it can get:

>>> 0.1 * 2 ** 55
3602879701896397.0

If we multiply that fraction by 10**55, we can see the value out to
55 decimal digits:

>>> 3602879701896397 * 10 ** 55 // 2 ** 55
1000000000000000055511151231257827021181583404541015625

meaning that the exact number stored in the computer is equal to
the decimal value 0.1000000000000000055511151231257827021181583404541015625.
Instead of displaying the full decimal value, many languages (including
older versions of Python), round the result to 17 significant digits:

>>> format(0.1, '.17f')
'0.10000000000000001'

The fractions and decimal modules make these calculations
easy:

>>> from decimal import Decimal
>>> from fractions import Fraction
>>> Fraction.from_float(0.1)
Fraction(3602879701896397, 36028797018963968)
>>> (0.1).as_integer_ratio()
(3602879701896397, 36028797018963968)
>>> Decimal.from_float(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')
>>> format(Decimal.from_float(0.1), '.17')
'0.10000000000000001'