Working with Strings in Python

1. What is a String?

A string is an immutable sequence of zero or more Unicode characters. In Python, a string literal is created by enclosing characters in a pair of either single quotes ('...') or double quotes ("...").

'A string can be enclosed within single quotes.'
"A string can also be enclosed within double quotes."

2. Using Escape Sequences Within Strings

An escape sequence is a sequence of characters within a string literal that Python translates into another character, typically one that is difficult or impossible to type directly into the literal. In Python, the backslash (\) is the escape character: every escape sequence starts with a backslash, followed by one or more characters. For example, \n is an escape sequence that represents a newline character.

>>> message = "An example of \n escape sequences."
>>> print(message)
An example of 
 escape sequences.
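
Other commonly used escape sequences include \t (tab), \\ (a literal backslash), and \" (a double quote inside a double-quoted string). The following is a minimal sketch; the variable names are illustrative only.

```python
# Each escape sequence is stored in the string as a single character.
tabbed = "Name:\tPython"        # \t is a horizontal tab
path = "C:\\temp\\notes.txt"    # \\ is one literal backslash
quoted = "She said \"Hi\""      # \" is a double quote inside double quotes

print(tabbed)
print(path)      # C:\temp\notes.txt
print(quoted)    # She said "Hi"
print(len("\n")) # 1
```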

3. Raw Strings

A raw string in Python is a string literal in which the backslash (\) is treated as a literal character rather than as the escape character. Raw strings are helpful when backslashes should remain part of the string. A raw string is created by prefixing a string literal with r or R.

>>> message = r"This is a \n raw string."
>>> print(message)
This is a \n raw string.
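
Two places where raw strings tend to help are Windows-style paths and regular expression patterns. The sketch below assumes an illustrative path and sample sentence that are not part of the original article. Note that a raw string literal cannot end with a single backslash.

```python
import re

normal = "C:\\new\\table.txt"   # backslashes must be doubled
raw = r"C:\new\table.txt"       # the same text written as a raw string
print(normal == raw)            # True

# Raw strings keep regular expression patterns readable.
digits = re.findall(r"\d+", "Order 66 shipped in 3 days")
print(digits)                   # ['66', '3']
```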

4. Multiline Strings in Python

Multiline strings in Python can be created using either three single quotes or three double quotes.

>>> message = '''This is a
... multiline string enclosed
... within a pair of three
... single quotes.'''
>>> print(message)
This is a
multiline string enclosed
within a pair of three
single quotes.
>>> message = """This is a
... multiline string enclosed
... within a pair of three
... double quotes."""
>>> print(message)
This is a
multiline string enclosed
within a pair of three
double quotes.
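
The line breaks typed inside a triple-quoted string become newline characters in the resulting string. A small sketch with illustrative text:

```python
message = """Line one
Line two"""

print(repr(message))                    # 'Line one\nLine two'
print(message == "Line one\nLine two")  # True
print(len(message.splitlines()))        # 2
```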

5. String Concatenation

Concatenation is the process in which we append one string to the end of another string. In Python, strings can be concatenated using the + operator.

Example 1

>>> message = "Hello " + "World!"
>>> print(message)
Hello World!

Example 2

>>> var1 = "Hello "
>>> var2 = "World!"
>>> message = var1 + var2
>>> print(message)
Hello World!

Example 3

>>> var = "Hello "
>>> message = var + "World!"
>>> print(message)
Hello World!

Example 4

Two string literals next to each other are automatically concatenated even without the + operator as shown below:

>>> message = "Hello " "World!"
>>> print(message)
Hello World!

It is important to note that the above approach works only with string literals. Trying to concatenate a string literal with a variable or an expression without the + operator will produce an error as shown below:

>>> var = "Hello "
>>> message = var "World!"
  File "<stdin>", line 1
    message = var "World!"
                         ^
SyntaxError: invalid syntax
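
The + operator only joins strings with other strings. The sketch below shows, under illustrative variable names, what happens when a string is added to an integer and three common ways around it: explicit conversion with str(), an f-string (Python 3.6+), and str.join() for joining many strings at once.

```python
count = 3

try:
    message = "Items: " + count       # str + int is not allowed
except TypeError as error:
    message = None
    print("TypeError:", error)

converted = "Items: " + str(count)      # explicit conversion
formatted = f"Items: {count}"           # f-string
joined = " ".join(["Hello", "World!"])  # join a list of strings with a separator

print(converted)   # Items: 3
print(formatted)   # Items: 3
print(joined)      # Hello World!
```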

6. String Repetition

In Python, a string can be repeated a specified number of times using the * operator as shown below:

>>> message = 3 * "Hi"
>>> print(message)
HiHiHi
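
Repetition is handy for building separators whose length depends on other text. In this sketch the title text is illustrative. A count of zero or less produces an empty string rather than an error.

```python
title = "Report"
underline = "-" * len(title)   # one dash per character in the title

print(title)
print(underline)               # ------
print(repr("Hi" * 0))          # ''
print(repr("Hi" * -2))         # ''
```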

7. String Methods

In Python, strings support a number of methods to perform common operations. Some of the most commonly used Python string methods are listed below. It is important to note that strings are immutable in Python. Therefore, all string methods return new values without changing the original string.
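
The following sketch illustrates immutability: a method call leaves the original string untouched, and assigning to a single character raises a TypeError. To "change" a string, build a new one and rebind the name.

```python
message = "Hello World"
shouted = message.upper()

print(message)   # Hello World (unchanged)
print(shouted)   # HELLO WORLD

try:
    message[0] = "J"           # strings do not support item assignment
except TypeError as error:
    print("TypeError:", error)

message = "J" + message[1:]    # build a new string instead
print(message)   # Jello World
```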

str.lower()

It converts a string into lower case.

>>> message = "Hello World"
>>> print(message.lower())
hello world

str.upper()

It converts a string into upper case.

>>> message = "Hello World"
>>> print(message.upper())
HELLO WORLD

str.capitalize()

It converts the first character of a string to upper case and all remaining characters to lower case.

>>> message = "hello world"
>>> print(message.capitalize())
Hello world

str.title()

It converts the first character of each word of a string to upper case and the remaining characters of each word to lower case.

>>> message = "hello world"
>>> print(message.title())
Hello World

str.swapcase()

It swaps cases of the characters of a string. Lower case becomes upper case and vice versa.

>>> message = "Hello World"
>>> print(message.swapcase())
hELLO wORLD

str.count()

It returns the number of non-overlapping occurrences of a specified substring in a string.

>>> message = "Hello World"
>>> print(message.count("o"))
2
>>> print(message.count("ll"))
1

str.startswith()

It returns True if the string starts with the specified value, otherwise False.

>>> message = "Hello World"
>>> print(message.startswith("He"))
True
>>> print(message.startswith("M"))
False

str.endswith()

It returns True if the string ends with the specified value, otherwise False.

>>> message = "Hello World"
>>> print(message.endswith("ld"))
True
>>> print(message.endswith("M"))
False

str.find()

It searches the string for a specified value and returns the index of its first occurrence. It returns -1 if the value is not found.

>>> message = "Hello World"
>>> print(message.find("l"))
2
>>> print(message.find("L"))
-1

str.rfind()

It searches the string for a specified value and returns the index of its last occurrence. It returns -1 if the value is not found.

>>> message = "Hello World"
>>> print(message.rfind("l"))
9
>>> print(message.rfind("L"))
-1

str.index()

It searches the string for a specified value and returns the index of its first occurrence. Unlike find(), it raises a ValueError if the value is not found.

>>> message = "Hello World"
>>> print(message.index("l"))
2
>>> print(message.index("L"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found

str.rindex()

It searches the string for a specified value and returns the index of its last occurrence. Unlike rfind(), it raises a ValueError if the value is not found.

>>> message = "Hello World"
>>> print(message.rindex("l"))
9
>>> print(message.rindex("L"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found
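
Choosing between find() and index() usually comes down to how a missing value should be handled. The sketch below is one possible pattern, not the only one; the search text is illustrative. When only a yes/no answer is needed, the in operator is simpler than either method.

```python
message = "Hello World"

# find() suits a simple conditional check on the result.
position = message.find("xyz")
if position == -1:
    print("not found")

# index() suits code where a missing value is an error to handle.
try:
    position = message.index("xyz")
except ValueError:
    position = None

# Both methods accept an optional start index for the search.
second_l = message.find("l", 3)
print(second_l)            # 3

# The in operator answers "is it there?" without returning an index.
present = "World" in message
print(present)             # True
```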

str.isalnum()

It returns True if the string is not empty and all of its characters are alphanumeric, meaning letters (such as a-z) or numbers (such as 0-9). Letters and digits from other Unicode scripts also count.

>>> message = "HelloWorld1"
>>> print(message.isalnum())
True
>>> message = "Hello World"
>>> print(message.isalnum())
False

str.isalpha()

It returns True if the string is not empty and all of its characters are letters (such as a-z). Letters from other Unicode scripts also count.

>>> message = "HelloWorld1"
>>> print(message.isalpha())
False
>>> message = "HelloWorld"
>>> print(message.isalpha())
True

str.isdecimal()

It returns True if the string is not empty and all of its characters are decimal characters (such as 0-9).

>>> message = "HelloWorld1"
>>> print(message.isdecimal())
False
>>> message = "123"
>>> print(message.isdecimal())
True
>>> message = "\u0030" # unicode for 0
>>> print(message.isdecimal())
True
>>> message = "\u00B2" # unicode for ² (exponent)
>>> print(message.isdecimal())
False

str.isdigit()

It returns True if the string is not empty and all of its characters are digits, otherwise False. Unlike isdecimal(), it also treats superscript digits, such as ², as digits.

>>> message = "HelloWorld1"
>>> print(message.isdigit())
False
>>> message = "123"
>>> print(message.isdigit())
True
>>> message = "\u0030" # unicode for 0
>>> print(message.isdigit())
True
>>> message = "\u00B2" # unicode for ² (exponent)
>>> print(message.isdigit())
True
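
The difference between isdecimal() and isdigit() matters when a string is about to be converted with int(), which rejects characters such as ². The sketch below also shows the related str.isnumeric() method, which is not covered above, and a hypothetical helper named to_int. The helper only handles unsigned whole numbers, so a string like "-5" yields None.

```python
# Digit five, superscript two, and the fraction one half.
samples = ["5", "\u00B2", "\u00BD"]

for text in samples:
    print(repr(text), text.isdecimal(), text.isdigit(), text.isnumeric())

def to_int(text):
    """Return int(text) if text consists only of decimal characters, else None."""
    return int(text) if text.isdecimal() else None

print(to_int("123"))      # 123
print(to_int("\u00B2"))   # None
```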

str.islower()

It returns True if all letters in the string are in lower case and the string contains at least one letter. Numbers, symbols, and spaces are ignored.

>>> message = "abc123"
>>> print(message.islower())
True
>>> message = "abc def"
>>> print(message.islower())
True
>>> message = "ABC def"
>>> print(message.islower())
False

str.isupper()

It returns True if all letters in the string are in upper case and the string contains at least one letter. Numbers, symbols, and spaces are ignored.

>>> message = "ABC123"
>>> print(message.isupper())
True
>>> message = "ABC DEF"
>>> print(message.isupper())
True
>>> message = "ABC def"
>>> print(message.isupper())
False

str.istitle()

It returns True if every word in the string starts with an upper case letter and the remaining letters of each word are lower case. Numbers, symbols, and spaces are ignored.

>>> message = "Hello World"
>>> print(message.istitle())
True
>>> message = "Abc123"
>>> print(message.istitle())
True
>>> message = "Abc123 Def"
>>> print(message.istitle())
True
>>> message = "ABC def"
>>> print(message.istitle())
False

str.lstrip()

It removes the specified leading characters from a string. The argument is treated as a set of characters to remove, not as a prefix. If no argument is given, leading whitespace is removed.

>>> message = "    Hello World    "
>>> print("Start" + message + "End")
Start    Hello World    End
>>> print("Start" + message.lstrip() + "End")
StartHello World    End
>>> message = "aaaHello Worldaaa"
>>> print(message.lstrip("aaa"))
Hello Worldaaa

str.rstrip()

It removes the specified trailing characters from a string. The argument is treated as a set of characters to remove, not as a suffix. If no argument is given, trailing whitespace is removed.

>>> message = "    Hello World    "
>>> print("Start" + message + "End")
Start    Hello World    End
>>> print("Start" + message.rstrip() + "End")
Start    Hello WorldEnd
>>> message = "aaaHello Worldaaa"
>>> print(message.rstrip("aaa"))
aaaHello World

str.strip()

It removes the specified leading and trailing characters from a string. The argument is treated as a set of characters to remove. If no argument is given, leading and trailing whitespace is removed.

>>> message = "    Hello World    "
>>> print("Start" + message + "End")
Start    Hello World    End
>>> print("Start" + message.strip() + "End")
StartHello WorldEnd
>>> message = "aaaHello Worldaaa"
>>> print(message.strip("aaa"))
Hello World
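
Because the argument to the strip methods is a set of characters, they can remove more than expected when used to cut off a fixed prefix or suffix. The sketch below demonstrates the pitfall and a hypothetical helper, remove_suffix, written for illustration. Python 3.9 and later also provide the built-in str.removeprefix() and str.removesuffix() methods for this purpose.

```python
filename = "classes.css"

# rstrip removes any trailing '.', 'c', or 's' characters, not the suffix ".css".
print(filename.rstrip(".css"))         # classe

def remove_suffix(text, suffix):
    """Return text without the given suffix, or text unchanged if it does not end with it."""
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text

print(remove_suffix(filename, ".css")) # classes
```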

str.split()

It splits a string into a list at the specified separator. If no separator is given, the string is split on runs of whitespace. This method accepts an optional second parameter called maxsplit. If it is specified, at most maxsplit splits are performed, so the list contains at most maxsplit + 1 elements.

>>> message = "Welcome to Python"
>>> print(message.split())
['Welcome', 'to', 'Python']
>>> print(message.split("o"))
['Welc', 'me t', ' Pyth', 'n']
>>> print(message.split("o", 2)) # The list will contain 3 elements
['Welc', 'me t', ' Python']
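
Calling split() with no argument is not the same as calling split(" "). The default collapses runs of whitespace and drops leading and trailing whitespace, whereas an explicit separator produces an empty string for every adjacent pair of separators. A sketch with illustrative input:

```python
text = "  a  b "

print(text.split())      # ['a', 'b']
print(text.split(" "))   # ['', '', 'a', '', 'b', '']

# Limiting the number of splits keeps the remainder intact.
key, value = "name=Ada=Lovelace".split("=", 1)
print(key)               # name
print(value)             # Ada=Lovelace

# split() and join() together normalize whitespace.
normalized = " ".join(text.split())
print(normalized)        # a b
```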

str.splitlines()

It splits a string into a list at line breaks. This method accepts an optional parameter called keepends, which specifies whether the line breaks should be included in the resulting items (True) or not (False). The default value is False.

>>> message = "This is line 1\nThis is line 2"
>>> print(message.splitlines())
['This is line 1', 'This is line 2']
>>> print(message.splitlines(True))
['This is line 1\n', 'This is line 2']

str.replace()

It replaces all occurrences of a specified substring within a string with another specified substring. An optional third argument limits the number of replacements.

>>> message = "Hello World"
>>> print(message.replace("o", "ff"))
Hellff Wffrld
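
str.replace() accepts an optional third argument that limits how many occurrences are replaced. It can also delete text by replacing it with an empty string. A short sketch:

```python
message = "Hello World"

first_only = message.replace("o", "0", 1)   # replace only the first match
removed = message.replace("l", "")          # delete every "l"
unchanged = message.replace("x", "y")       # no match, same text returned

print(first_only)   # Hell0 World
print(removed)      # Heo Word
print(unchanged)    # Hello World
```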

8. String Indexing

Since strings in Python are sequences made up of characters, they can be indexed, meaning that each of a string’s characters corresponds to an index number. The first character has an index 0.

Accessing Characters Using a Positive Index Number

We can use a positive index number in square brackets and access a character as shown below:

>>> message = "Hello World"
>>> print(message[0])
H
>>> print(message[6])
W

Accessing Characters Using a Negative Index Number

Python also supports negative index numbers. We can use negative indices with strings to start counting from the right, starting at the index number -1.

>>> message = "Hello World"
>>> print(message[-1])
d
>>> print(message[-3])
r
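
Valid indices for a string of length n run from -n to n - 1. The sketch below shows how positive and negative indices relate, and that an index outside this range raises an IndexError.

```python
message = "Hello World"
n = len(message)                      # 11

# The index -k refers to the same character as the index n - k.
print(message[-1] == message[n - 1])  # True
print(message[-n] == message[0])      # True

try:
    message[n]                        # one past the last valid index
    in_range = True
except IndexError as error:
    in_range = False
    print("IndexError:", error)
```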

9. String Slicing

String slicing is the process of extracting a range of characters from a string. We can create a slice or a substring from an original string by using a range of index numbers separated by a colon in the format [x:y]. Here, the first index number is inclusive and specifies where the slice starts from. The second index number is exclusive and specifies where the slice ends.

>>> message = "Hello World"
>>> print(message[1:4])
ell

We can include either end of a string by omitting one of the index numbers. For example, to create a slice that starts at the beginning of the string and ends in the middle, we have to specify only the index number after the colon as shown below:

>>> message = "Hello World"
>>> print(message[:4])
Hell

Likewise, we can create a slice that starts in the middle of the string and includes each character until the end of the string by specifying only the index number before the colon as shown below:

>>> message = "Hello World"
>>> print(message[4:])
o World

We can also omit both the index numbers to create a slice that starts at the start of the string and ends at the end of the string. This will be beneficial in combination with the stride parameter discussed below.

>>> message = "Hello World"
>>> print(message[:])
Hello World

It is also possible to use negative index numbers for string slicing.

>>> message = "Hello World"
>>> print(message[-5:-1])
Worl

Specifying Stride While Slicing Strings

In the examples above, we have used two index numbers to slice a string. It is also possible to use a third parameter called ‘stride’ which specifies how many characters to move forward after the first character has been retrieved from the string. The default value of stride is 1.

>>> message = "Hello World"
>>> print(message[1:4:1])
ell
>>> print(message[1:4:2])
el
>>> print(message[1:8:2])
el o

A negative value for stride specifies that we want to extract characters from the string in reverse order.

>>> message = "Hello World"
>>> print(message[4:1:-1])
oll
>>> print(message[-1:-4:-1])
dlr
>>> print(message[-2:-9:-2])
lo l
>>> print(message[::-1])
dlroW olleH
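
Unlike indexing, slicing does not raise an error when an index falls outside the string; the slice is simply cut off at the string's boundaries. The sketch below shows this, along with a hypothetical is_palindrome helper that uses the reversing slice [::-1].

```python
message = "Hello World"

print(message[6:100])        # World
print(repr(message[20:]))    # ''
print(message[::2])          # HloWrd

def is_palindrome(text):
    """Return True if text reads the same forwards and backwards."""
    return text == text[::-1]

print(is_palindrome("level"))   # True
print(is_palindrome("Hello"))   # False
```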