Python Programming Made Easy (2016)

Chapter 5: Strings

Strings are amongst the most popular types in Python. We can create them simply by enclosing characters in quotes. Python treats single quotes the same as double quotes.

Creating strings is as simple as assigning a value to a variable. For example:

str1 = 'Hello World!'

str2 = "Python Programming"

Accessing Values in Strings

Python has no separate character type; a single character is simply a string of length one, and so is itself a substring. To access a substring, use square brackets with an index (for one character) or with a slice of the form [start:end] (for a range of characters, where the character at index end is not included). Following is a simple example:

str1 = 'Hello World!'

str2 = "Python Programming"

print "str1[0]: ", str1[0]

print "str2[7:11]: ", str2[7:11]

When the above code is executed, it produces the following result:

Fig 5.1: Accessing strings output screenshot
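The same indexing and slicing can be sketched with the results stored in variables. The names first_char, word and last_char are ours, chosen for illustration, and print() is written with parentheses so that the lines also run on Python 3:

```python
str1 = 'Hello World!'
str2 = "Python Programming"

first_char = str1[0]    # index 0 is the first character
word = str2[7:11]       # characters at indices 7, 8, 9 and 10
last_char = str1[-1]    # negative indices count from the end

print(first_char)       # H
print(word)             # Prog
print(last_char)        # !
```

Note that the slice stops just before index 11, so str2[7:11] has exactly 11 - 7 = 4 characters.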

Updating Strings

Strings are immutable, so we cannot change a string in place. We can "update" an existing string by (re)assigning a variable to another string. The new value can be built from the previous value or can be a completely different string altogether. Following is a simple example:

str1 = 'Hello World!'

str2 = str1[:6] + 'Python'

print "Updated String :- ", str2

When the above code is executed, it produces the following result:

Fig 5.2: Updating strings
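A minimal sketch of why "update" is written in quotes: a string cannot be changed in place, so an attempt to assign to one position fails and a new string has to be built instead. The flag changed_in_place is a name we introduce only for this illustration:

```python
str1 = 'Hello World!'

# Assigning to a single position of a string raises TypeError.
try:
    str1[0] = 'J'
    changed_in_place = True
except TypeError:
    changed_in_place = False

# Build a new string from a piece of the old one instead.
str2 = str1[:6] + 'Python'

print(changed_in_place)   # False
print(str1)               # Hello World!  (the original is untouched)
print(str2)               # Hello Python
```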

Escape Characters

The following table lists the escape sequences, written in backslash notation, that represent special or non-printable characters.

Escape sequences are interpreted in both single-quoted and double-quoted strings.

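Backslash notation    Description
\a                    Bell or alert
\b                    Backspace
\f                    Formfeed
\n                    Newline
\nnn                  Octal notation, where each n is in the range 0-7
\r                    Carriage return
\t                    Tab
\v                    Vertical tab
\xnn                  Hexadecimal notation, where each n is in the range 0-9, a-f, or A-F

Note: The notations \cx, \C-x, \M-\C-x, \e and \s, which appear in the escape tables of some other languages, are not escape sequences in Python; such a sequence is left in the string unchanged, backslash included. A \x that is not followed by two hexadecimal digits is an error.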

Table 5.1: Escape characters
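As a small illustration of the table, the sketch below compares an ordinary string with a raw string and builds the same letter from its hexadecimal and octal codes. The variable names are ours:

```python
tabbed = 'a\tb'         # \t is interpreted as one tab character
raw_tabbed = r'a\tb'    # raw string: the backslash and the t are kept as two characters
hex_a = '\x41'          # hexadecimal 41 is the letter A
octal_a = '\101'        # octal 101 is also the letter A

print(len(tabbed))      # 3
print(len(raw_tabbed))  # 4
print(hex_a)            # A
print(octal_a)          # A
```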

String Special Operators

Assume string variable a holds 'Hello' and variable b holds 'Python', then:

Operator  Description                                                             Example
+         Concatenation: joins the strings on either side of the operator         a + b results in HelloPython
*         Repetition: builds a new string from repeated copies of the string      a*2 results in HelloHello
[]        Slice: gives the character at the given index                           a[1] results in e
[ : ]     Range slice: gives the characters in the given range                    a[1:4] results in ell
in        Membership: returns True if the substring occurs in the string          'H' in a results in True
not in    Membership: returns True if the substring does not occur in the string  'M' not in a results in True
r/R       Raw string: suppresses the special meaning of escape characters         print r'\n' prints \n and print R'\n' prints \n
%         Format: performs string formatting                                      See the String Formatting Operator section

Note: A raw string is written exactly like a normal string, except that the letter r, in lowercase (r) or uppercase (R), is placed immediately before the opening quotation mark.

Table 5.2: String Special Operators
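The operators in the table can be tried out as follows. This is only a sketch: a and b hold the values assumed above, and the names on the left-hand side are ours:

```python
a = 'Hello'
b = 'Python'

concatenated = a + b      # 'HelloPython'
repeated = a * 2          # 'HelloHello'
one_char = a[1]           # 'e'
some_chars = a[1:4]       # 'ell'
has_h = 'H' in a          # True
lacks_m = 'M' not in a    # True
raw = r'\n'               # two characters: a backslash and the letter n

print(concatenated)
print(repeated)
print(len(raw))           # 2
```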

String Formatting Operator

One of Python's coolest features is the string format operator %. This operator is unique to strings and makes up for the lack of functions from C's printf() family. Following is a simple example:

print "My name is %s and weight is %d kg!" % ('Tara', 50)

When the above code is executed, it produces the following result:

Fig 5.3: formatting using %

Here is the list of symbols that can be used with %:

Format Symbol    Conversion
%c               Character
%s               String conversion via str() prior to formatting
%i               Signed decimal integer
%d               Signed decimal integer
%u               Unsigned decimal integer
%o               Octal integer
%x               Hexadecimal integer (lowercase letters)
%X               Hexadecimal integer (uppercase letters)
%e               Exponential notation (with lowercase 'e')
%E               Exponential notation (with uppercase 'E')
%f               Floating point real number
%g               The shorter of %f and %e
%G               The shorter of %f and %E

Table 5.3: format symbols
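A few of these symbols in use are sketched below. The values 65, 8, 255, 3.5 and 12345.678 are arbitrary sample inputs, and the variable names are ours:

```python
sentence = "My name is %s and weight is %d kg!" % ('Tara', 50)
as_char = "%c" % 65           # 'A' (the character with code 65)
as_octal = "%o" % 8           # '10'
as_hex = "%x" % 255           # 'ff'
as_upper_hex = "%X" % 255     # 'FF'
as_float = "%f" % 3.5         # '3.500000' (six decimal places by default)
as_exp = "%e" % 12345.678     # '1.234568e+04'

print(sentence)               # My name is Tara and weight is 50 kg!
```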

Built-in string functions

Python includes the following built-in methods to manipulate strings:

Each entry gives the serial number and method, its description, and an example with its output.

1. capitalize()
   Capitalizes the first letter of the string.
       >>> str = "hello python "
       >>> print str.capitalize()
       Hello python

2. endswith(suffix, beg=0, end=len(string))
   Determines whether the string (or the substring from index beg up to index end, if these are given) ends with suffix; returns True if so and False otherwise.
       >>> "Computer".endswith("er")
       True

3. find(str, beg=0, end=len(string))
   Determines whether str occurs in the string (or in the substring from index beg up to index end, if these are given); returns the index of the first occurrence if found and -1 otherwise.
       >>> str = 'Computer'
       # When the start index is omitted, the search starts from the beginning.
       >>> print str.find('put')
       3
       >>> print str.find('put', 2)
       3
       >>> print str.find('put', 1, 3)
       -1
       # Displays -1 because the substring could not be found between index 1 and index 3-1.

4. index(str, beg=0, end=len(string))
   Same as find(), but raises an exception if str is not found.
       >>> "Computer".index('pat')
       Traceback (most recent call last):
         File "<pyshell#1>", line 1, in <module>
           "Computer".index('pat')
       ValueError: substring not found

5. isalnum()
   Returns True if the string has at least one character and all characters are alphanumeric, and False otherwise.
       >>> str = 'Hello Python'
       >>> print str.isalnum()
       False
       # Returns False because the space is not an alphanumeric character.
       >>> print "Python123".isalnum()
       True

6. isalpha()
   Returns True if the string has at least one character and all characters are alphabetic, and False otherwise.
       >>> print 'Python123'.isalpha()
       False
       >>> print 'python'.isalpha()
       True

7. isdigit()
   Returns True if the string contains only digits and False otherwise.
       >>> str = '1234'
       >>> print str.isdigit()
       True

8. islower()
   Returns True if the string has at least one cased character and all cased characters are in lowercase, and False otherwise.
       >>> print 'HELLO python'.islower()
       False

9. isnumeric()
   Returns True if a unicode string contains only numeric characters and False otherwise.
       >>> str = u"year2013"
       >>> print str.isnumeric()
       False

10. isspace()
   Returns True if the string contains only whitespace characters and False otherwise.
       >>> str = ' '
       >>> print str.isspace()
       True

11. istitle()
   Returns True if the string is properly "titlecased" and False otherwise.
       >>> print 'Hello World'.istitle()
       True

12. isupper()
   Returns True if the string has at least one cased character and all cased characters are in uppercase, and False otherwise.
       >>> print 'HELLO python'.isupper()
       False

13. join(seq)
   Merges (concatenates) the strings in the sequence seq into a single string, using the string on which join() is called as the separator.
       >>> str1 = ('10', 'oct', '2013')
       >>> str = "-"
       >>> print str.join(str1)
       10-oct-2013

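14. len(string)
   Returns the length of the string. (len() is a built-in function rather than a string method.)
       >>> len("Computer")
       8

15. lower()
   Converts all uppercase letters in the string to lowercase.
       >>> "Optical Fibre".lower()
       'optical fibre'

16. lstrip()
   Removes all leading whitespace in the string.
       >>> " Python ".lstrip()
       'Python '

17. max(str)
   Returns the character of str that comes last in character order (lowercase letters come after uppercase letters). Like len(), max() is a built-in function.
       >>> max("Python")
       'y'

18. min(str)
   Returns the character of str that comes first in character order (uppercase letters come before lowercase letters). Like len(), min() is a built-in function.
       >>> min("Python")
       'P'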

19. replace(old, new [, max])
   Replaces all occurrences of old in the string with new, or at most max occurrences if max is given.
       >>> "C++ is a powerful language".replace("C++", "Python")
       'Python is a powerful language'

20. rfind(str, beg=0, end=len(string))
   Same as find(), but searches backwards in the string.
       >>> "Python Program".rfind("P")
       7

21. rindex(str, beg=0, end=len(string))
   Same as index(), but searches backwards in the string.
       >>> "Python Program".rindex("P")
       7

22. rstrip()
   Removes all trailing whitespace of the string.
       >>> " Python ".rstrip()
       ' Python'

23. split(str="", num=string.count(str))
   Splits the string at the delimiter str (at whitespace if str is not provided) and returns a list of substrings; performs at most num splits if num is given.
       >>> "1:Anand-2:Babu-3:Charles-4:Dravid".split('-')
       ['1:Anand', '2:Babu', '3:Charles', '4:Dravid']

24. splitlines([keepends])
   Splits the string at all line breaks and returns a list of the lines. The line breaks are removed unless keepends is given and true, as in the example.
       >>> "1:Anu-\n2:Banu-\n3:Charles-\n4:David".splitlines(True)
       ['1:Anu-\n', '2:Banu-\n', '3:Charles-\n', '4:David']

25. startswith(str, beg=0, end=len(string))
   Determines whether the string (or the substring from index beg up to index end, if these are given) starts with the substring str; returns True if so and False otherwise.
       >>> "Python is a free software…….".startswith("Python")
       True

26. strip([chars])
   Performs both lstrip() and rstrip() on the string.
       >>> "  Python ".strip()
       'Python'

27. swapcase()
   Inverts the case of all letters in the string.
       >>> "python PROGRAMMING".swapcase()
       'PYTHON programming'

28. title()
   Returns a "titlecased" version of the string, that is, all words begin with uppercase and the rest are lowercase.
       >>> "python PROGRAMMING".title()
       'Python Programming'

29. upper()
   Converts lowercase letters in the string to uppercase.
       >>> "python".upper()
       'PYTHON'

30. zfill(width)
   Returns the original string left-padded with zeros to a total of width characters; intended for numbers, zfill() retains any sign given (less one zero).
       >>> "1234".zfill(10)
       '0000001234'

Table 5.4: Built-in string functions
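Two pairs of methods from the table are easy to confuse, so a short sketch may help: find() and index() differ only in how they report failure, and split() and join() undo each other. The variable names below are ours:

```python
text = "Computer"

found_at = text.find("put")       # 3
missing = text.find("pat")        # -1, no exception is raised

# index() raises ValueError instead of returning -1.
try:
    text.index("pat")
    index_raised = False
except ValueError:
    index_raised = True

parts = "10-oct-2013".split("-")  # ['10', 'oct', '2013']
rejoined = "/".join(parts)        # '10/oct/2013'

print(found_at)
print(missing)
print(index_raised)               # True
print(rejoined)
```

find() is the more convenient choice when a missing substring is an ordinary case, and index() when it should be treated as an error.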

String constants

The constants below are defined in the string module, so the examples assume that import string has been executed first. Each entry gives the constant, its description, and an example.

string.ascii_uppercase
   A string containing all the uppercase letters.
       >>> string.ascii_uppercase
       'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

string.ascii_lowercase
   A string containing all the lowercase letters.
       >>> string.ascii_lowercase
       'abcdefghijklmnopqrstuvwxyz'

string.ascii_letters
   A string containing both the lowercase and the uppercase letters.
       >>> string.ascii_letters
       'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

string.digits
   A string containing the decimal digits.
       >>> string.digits
       '0123456789'

string.hexdigits
   A string containing the hexadecimal digits.
       >>> string.hexdigits
       '0123456789abcdefABCDEF'

string.octdigits
   A string containing the octal digits.
       >>> string.octdigits
       '01234567'

string.punctuation
   A string containing all the punctuation characters.
       >>> string.punctuation
       '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

string.whitespace
   A string containing all the ASCII characters that are considered whitespace: space, tab, linefeed, return, formfeed, and vertical tab.
       >>> string.whitespace
       '\t\n\x0b\x0c\r '

string.printable
   A string containing all the characters that are considered printable: letters, digits, punctuation and whitespace.
       >>> string.printable
       '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'

Table 5.6: string constants
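One common use of these constants is to classify the characters of a string with the in operator. The helper count_kinds and the sample text below are hypothetical, written only to illustrate the idea:

```python
import string

def count_kinds(text):
    """Count the letters, digits and punctuation marks in text (hypothetical helper)."""
    counts = {'letters': 0, 'digits': 0, 'punctuation': 0}
    for ch in text:
        if ch in string.ascii_letters:
            counts['letters'] += 1
        elif ch in string.digits:
            counts['digits'] += 1
        elif ch in string.punctuation:
            counts['punctuation'] += 1
    return counts

result = count_kinds("Room 42, Floor 3!")
print(result['letters'])       # 9
print(result['digits'])        # 3
print(result['punctuation'])   # 2
```

Spaces fall into none of the three groups, so they are simply skipped.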

Example 5.1: Write a program to check whether the string is a palindrome or not.

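def palin():
    # raw_input() returns the typed text as a string.
    str = raw_input("Enter the String: ")
    l = len(str)
    p = l - 1
    index = 0
    # Compare characters from both ends, moving towards the middle.
    while index < p:
        if str[index] == str[p]:
            index = index + 1
            p = p - 1
        else:
            print "String is not a palindrome"
            break
    else:
        # The loop ended without a break, so every pair matched.
        print "String is a Palindrome"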

Execution steps:

Fig 5.4: Palindrome execution steps

Output:

String is a Palindrome
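For comparison, the same check can be sketched in one line with a slice. The step value -1, which reverses a string, is a slicing feature not covered earlier in this chapter, and is_palindrome is a name we introduce for illustration:

```python
def is_palindrome(text):
    """Return True if text reads the same forwards and backwards."""
    # text[::-1] is a copy of text with the characters in reverse order.
    return text == text[::-1]

print(is_palindrome("madam"))    # True
print(is_palindrome("python"))   # False
```

Unlike the loop version, this sketch returns a value instead of printing a message, which makes it easier to reuse in other programs.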

Regular expressions and Pattern matching

A regular expression is a sequence of letters and some special characters (also called meta characters). These special characters have symbolic meaning. The sequence formed by using meta characters and letters can be used to represent a group of patterns.

Regular expressions can be used in Python to match a particular pattern by importing the re module.

Note: The re module includes functions for working with regular expressions.

The following table explains how the meta characters are used to form regular expressions.

Each entry gives the serial number and metacharacter, its usage, and an example.

1. [ ]
   Matches any one character from a set of characters.
   Example: [star] matches any one of the characters s, t, a or r; [a-z] matches any one lowercase letter.

2. ^
   Complements a set of characters when it is placed first inside the square brackets.
   Example: [^star] matches any character other than s, t, a or r.

3. $
   Matches only at the end of the string.
   Example: Star$ matches the Star in BlueStar but does not match the Star in Stardom.

4. *
   Specifies that the previous character can be matched zero or more times.
   Example: B*e matches e, Be, BBe and so on.

5. +
   Specifies that the previous character can be matched one or more times.
   Example: B+r matches Br, BBr, BBBr and so on.

6. ?
   Specifies that the previous character can be matched either once or zero times.
   Example: B?ar matches only ar or Bar.

7. { }
   The curly brackets take two integer values. The first value specifies the minimum number of occurrences of the previous character and the second value specifies the maximum number of occurrences.
   Example: wate{1,4}r matches only water, wateer, wateeer or wateeeer.

Table 5.7: meta characters
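The behaviour of the metacharacters can be checked directly with the re module. In the sketch below, whole is a hypothetical helper of ours that reports whether a pattern matches an entire string; it does so by appending $ to the pattern:

```python
import re

def whole(pattern, text):
    """Hypothetical helper: True if pattern matches all of text."""
    return re.match(pattern + '$', text) is not None

# re.search looks for the pattern anywhere in the string.
end_ok = re.search('Star$', 'BlueStar') is not None     # True
end_fail = re.search('Star$', 'Stardom') is not None    # False

print(whole('B*e', 'BBe'))              # True: zero or more B, then e
print(whole('B*e', 'Bye'))              # False: y is not allowed by the pattern
print(whole('B+r', 'Br'))               # True: one or more B, then r
print(whole('B?ar', 'ar'))              # True: the B is optional
print(whole('wate{1,4}r', 'wateeer'))   # True: between one and four e
print(whole('wate{1,4}r', 'watr'))      # False: at least one e is required
print(whole('[^star]', 'b'))            # True: b is not in the set
```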

Functions from re module

Each entry gives the function or method and its description. compile(), match(), search(), findall() and finditer() are functions of the re module; group(), start(), end() and span() are methods of the match object that match() and search() return.

re.compile()
   Compiles the pattern into a pattern object. After compilation, the pattern object provides methods for operations such as searching and substitution.

re.match()
   Determines whether the regular expression (RE) matches at the beginning of the string.

group()
   Returns the string matched by the RE.

start()
   Returns the starting position of the match.

end()
   Returns the end position of the match.

span()
   Returns a tuple containing the (start, end) positions of the match.

re.search()
   Scans through the string and finds the first position where the RE matches.

re.findall()
   Finds all substrings where the RE matches, and returns them as a list.

re.finditer()
   Finds all substrings where the RE matches, and returns them as an iterator of match objects.

Table 5.8: functions in re module

Example 5.2: Demonstration of re functions

import re

# Compile the pattern once and reuse the pattern object.
p = re.compile("hell*o")

m = p.match("hellooooo python")
print m.group()    # hello
print m.start()    # 0
print m.end()      # 5
print m.span()     # (0, 5)

m = p.search("hellooo python")
print m            # a match object (None would mean no match)

m = p.findall("hello helloo python")
print m            # ['hello', 'hello']
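The finditer() function and the case of a failed search are not shown in the example, so here is a minimal sketch of both. The names pattern, found and no_match are ours, and print() is written with parentheses so that the lines also run on Python 3:

```python
import re

pattern = re.compile('hell*o')

# finditer() yields one match object per match.
found = []
for m in pattern.finditer('hello helloo python'):
    found.append((m.group(), m.start(), m.end()))

# search() returns None when the pattern does not occur at all.
no_match = pattern.search('goodbye python')

print(found)      # [('hello', 0, 5), ('hello', 6, 11)]
print(no_match)   # None
```

Because search() and match() can return None, it is safer to test the result with if before calling group() or start() on it.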

Solved Programs

1. Write a program to count the number of occurrences of 's' in the string 'mississippi'.

Ans:

def lettercount():
    word = 'mississippi'
    count = 0
    for letter in word:
        if letter == 's':
            count = count + 1
    # Print once, after the whole word has been scanned.
    print(count)

lettercount()
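The same result can be obtained without a loop by using the count() method of strings, which is not listed in Table 5.4. The variable names below are ours:

```python
word = 'mississippi'

# count() returns the number of non-overlapping occurrences of the substring.
s_count = word.count('s')
ss_count = word.count('ss')

print(s_count)    # 4
print(ss_count)   # 2
```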

2. Write a program that reads a string and displays the longest substring of the given string that has only consonants.

Ans:

string = raw_input("Enter a string: ")
length = len(string)
maxlength = 0
maxsub = ''
sub = ''
lensub = 0
# Any character that is not a vowel is treated as a consonant here.
for a in range(length):
    if string[a] in 'aeiou' or string[a] in 'AEIOU':
        # A vowel ends the current run of consonants.
        if lensub > maxlength:
            maxsub = sub
            maxlength = lensub
        sub = ''
        lensub = 0
    else:
        sub += string[a]
        lensub = len(sub)
# Check the run that reaches the end of the string.
if lensub > maxlength:
    maxsub = sub
    maxlength = lensub
print "Maximum length consonant substring is:", maxsub,
print "with", maxlength, "characters"
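A regular expression gives a shorter sketch of the same idea. Here the set [b-df-hj-np-tv-z] lists the letter ranges that remain once the five vowels are left out, so, unlike a simple "not a vowel" test, spaces and digits do not count as consonants. The function longest_consonant_run is a hypothetical helper of ours:

```python
import re

def longest_consonant_run(text):
    """Return the longest run of consonant letters in text (hypothetical helper)."""
    # One or more consonants in a row; re.IGNORECASE also accepts uppercase letters.
    runs = re.findall('[b-df-hj-np-tv-z]+', text, re.IGNORECASE)
    longest = ''
    for run in runs:
        if len(run) > len(longest):
            longest = run
    return longest

print(longest_consonant_run('strengths'))         # ngths
print(longest_consonant_run('Rhythm and blues'))  # Rhythm
```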

3.   Write a program to determine if the given substring is present in the string.

import re

substring = 'Rain'
search1 = re.search(substring, 'Rain Rain go away !')
if search1:
    position = search1.start()
    print "matched", substring, "at position", position
else:
    print "No match found"

4. Write a program to determine if the given substring (defined using meta characters) is present in the given string.

Ans:

import re

# 'dance+' matches danc followed by one or more e.
p = re.compile('dance+')
search1 = re.search(p, 'Western dancers dance for English music well')
if search1:
    match = search1.group()
    print "matched =", match
    index = search1.start()
    print "at position", index
else:
    print "No match found"

5. Write a Python script that takes a string with multiple words, capitalizes the first letter of each word and forms a new string out of it.

Ans.

string = raw_input("Enter a string: ")
length = len(string)
a = 0
string2 = ''  # empty string
while a < length:
    if a == 0:
        # The first character of the string starts the first word.
        string2 += string[0].upper()
        a += 1
    elif string[a] == ' ' and a + 1 < length and string[a+1] != ' ':
        # A space followed by a non-space marks the start of a new word.
        string2 += string[a]
        string2 += string[a+1].upper()
        a += 2
    else:
        string2 += string[a]
        a += 1
print "Original string:", string
print "Converted string:", string2
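The same task can be sketched with split() and join() from Table 5.4. The function capitalize_words is a hypothetical helper of ours; it splits at single spaces so that repeated spaces in the input are preserved:

```python
def capitalize_words(text):
    """Capitalize the first letter of every space-separated word (hypothetical helper)."""
    capped = []
    for word in text.split(' '):
        # word[:1] is '' for an empty piece, so repeated spaces cause no error.
        capped.append(word[:1].upper() + word[1:])
    return ' '.join(capped)

print(capitalize_words('honesty is the best policy'))   # Honesty Is The Best Policy
```

Unlike title(), this sketch leaves the remaining letters of each word as they were instead of converting them to lowercase.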

6. Which string method is used to implement the following?

a) To count the number of characters in the string.
b) To change the first character of the string to a capital letter.
c) To check whether a given character is a letter or a number.
d) To change lowercase letters to uppercase.
e) To change one character into another character.

Ans.

a) len(str)
b) str.title() or str.capitalize()
c) str.isalpha() and str.isdigit()
d) str.upper()
e) str.replace(char, newchar)
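These answers can be tried on a sample string. The string "hello world 42" and the variable names are ours, chosen only for illustration:

```python
s = "hello world 42"

length = len(s)                # a) 14
first_cap = s.capitalize()     # b) 'Hello world 42'
is_letter = "h".isalpha()      # c) True
is_number = "4".isdigit()      # c) True
all_caps = s.upper()           # d) 'HELLO WORLD 42'
swapped = s.replace("o", "0")  # e) 'hell0 w0rld 42'

print(length)
print(first_cap)
print(all_caps)
print(swapped)
```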

7. Write a program to input any string and to find number of words in the string.

Ans.

str = "Honesty is the best policy"

words = str.split()

print len(words)

8. Consider the string str="A Friend Indeed". Write statements in Python to implement the following:

a) To display the last four characters.

b) To display the substring starting from index 4 and ending at index 8.

c) To check whether string has alphanumeric characters or not.

d) To trim the last four characters from the string.

e) To trim the first four characters from the string.

f) To display the starting index for the substring “end”.

g) To change the case of the given string.

h) To check if the string is in title case.

i) To replace all the occurrences of letter ‘e’ in the string with ‘*’?

Ans:

a) str[-4:]
b) str[4:8] (use str[4:9] if the character at index 8 is to be included)
c) str.isalnum()
d) str = str[:-4]
e) str = str[4:]
f) str.find("end")
g) str.swapcase()
h) str.istitle()
i) str.replace('e', '*')
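The results of these statements for the given string can be sketched as follows. The variable is named s here, rather than str, so that the built-in name str is not hidden; the other names are ours as well:

```python
s = "A Friend Indeed"

last_four = s[-4:]            # a) 'deed'
middle = s[4:8]               # b) 'iend' (indices 4 to 7)
alnum = s.isalnum()           # c) False, because of the spaces
without_last = s[:-4]         # d) 'A Friend In'
without_first = s[4:]         # e) 'iend Indeed'
end_at = s.find("end")        # f) 5
swapped = s.swapcase()        # g) 'a fRIEND iNDEED'
titled = s.istitle()          # h) True
starred = s.replace('e', '*') # i) 'A Fri*nd Ind**d'

print(last_four)
print(end_at)
print(starred)
```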

Practice Problems

1. Write a program in Python to count the number of vowels in a given word.

2.  Write a program using regular expressions in Python to validate passwords in a given list of passwords.

3.  Write a program to partition the string at the occurrence of a given letter.

4.  Write a program in Python to sort a given array of student names in alphabetical order.

5. Consider the string str=" Hello Python". Write statements in Python to implement the following:

a)  To display the last six characters.

b)  To display the substring starting from index 2 and ending at index 6

c)  To check whether string has alphanumeric characters or not.

d)  To trim the last six characters from the string.

e)  To trim the first six characters from the string.

f)  To change the case of the given string.

g)  To check if the string is in title case.