A substring is a sequence of characters within a string. In Python, we can extract different kinds of substrings using the slicing operation.
Syntax of Python substring:
string[start:end:step]
start: Starting index of the substring. The character at this index is included in the substring. If not specified, the starting index is 0.
end: End index of the substring. The character at this index is not included in the substring. If not specified, or if the specified value exceeds the length of the original string, the end index is set to the length of the string.
step: 'step' specifies the difference between the indexes of two consecutive characters included in the substring. If 'step' is 1, all characters in the range from 'start' to 'end' are included. If 'step' is 2, every second character is included, and so on. If not specified, the default value is 1.
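The effect of 'step' and of the default values can be seen in a short sketch. The variable name text and the sample string are chosen here only for illustration; they are not part of the original examples.

```python
# Hypothetical sample string, used only to illustrate the three slice parameters.
text = "abcdefghij"

every_char = text[0:10:1]   # step 1: every character from index 0 to 9
alternate = text[0:10:2]    # step 2: characters at indexes 0, 2, 4, 6, 8
defaults = text[::]         # start, end and step omitted: 0, len(text), 1
clamped = text[2:100]       # end larger than the length is treated as len(text)

print(every_char)  # abcdefghij
print(alternate)   # acegi
print(defaults)    # abcdefghij
print(clamped)     # cdefghij
```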
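These index values can be either positive or negative. We are all familiar with positive indexing, which counts from 0 at the start of the string. Negative indexing counts from the end of the string: -1 is the last character, -2 is the second last, and so on. The diagram below shows positive and negative indexing of the characters within a string.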
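The relationship between the two kinds of index can also be printed directly. This is a minimal sketch; the names text, n and pairs are illustrative, and it relies only on the fact that for a string of length n, the negative index of the character at positive index i is i - n.

```python
text = "qnaplusdotcom"
n = len(text)  # 13 characters

# Pair every character with its positive index and the equivalent negative index.
pairs = [(i, i - n, ch) for i, ch in enumerate(text)]

for positive, negative, ch in pairs:
    print(positive, negative, ch)  # first line: 0 -13 q, last line: 12 -1 m

# Both indexes refer to the same character.
print(text[12] == text[-1])  # True
```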
Substring Examples
First 7 characters
str = "qnaplusdotcom"
print(str[:7])
Output:
qnaplus
The expression str[:7] returns the first 7 characters of the original string str. The start index is not specified here, so it is assumed to be 0. The expression str[0:7] therefore returns the same substring.
Substring of length 7 starting from index 3
str = "qnaplusdotcom"
print(str[3:10])
Output:
plusdot
Here the 'start' index is 3 and the 'end' index is 10 (3+7). The substring contains the characters from index 3 to 9 (end-1).
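Since the 'end' index is simply start plus the desired length, the calculation can be wrapped in a small helper. The function name substring and its parameters are hypothetical, and the sketch assumes a non-negative start index.

```python
def substring(text: str, start: int, length: int) -> str:
    """Return up to `length` characters of `text`, beginning at index `start`.

    Hypothetical helper for illustration; assumes `start` is not negative.
    """
    return text[start:start + length]


print(substring("qnaplusdotcom", 3, 7))   # plusdot
print(substring("qnaplusdotcom", 10, 7))  # com (only 3 characters remain)
```

If fewer than `length` characters remain after `start`, the slice stops at the end of the string instead of raising an error.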
All characters except first 3
str = "qnaplusdotcom"
print(str[3:])
Output:
plusdotcom
Here the substring starts from index 3, leaving out the first 3 characters at indexes 0, 1 and 2. As the 'end' index is not specified, it defaults to the length of the string.
Last 3 characters
str = "qnaplusdotcom"
print(str[-3:])
Output:
com
As mentioned earlier, -3 represents the third-last character of the original string. So the expression str[-3:] returns all characters from the third-last character to the end of the string.
All but last 3 characters
str = "qnaplusdotcom"
print(str[:-3])
Output:
qnaplusdot
The expression str[:-3] returns all characters from index 0 up to and including index -4 (-3-1), because the character at the 'end' index -3 is not included.
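A negative index in a slice behaves like the same index counted from the length of the string. The sketch below checks this for the last two examples; the names text and n are illustrative.

```python
text = "qnaplusdotcom"
n = len(text)  # 13

last_three = text[-3:]
all_but_last_three = text[:-3]

# -3 is equivalent to n - 3 (index 10) for this string.
print(last_three == text[n - 3:])          # True
print(all_but_last_three == text[:n - 3])  # True

# The two slices split the string at the same point, so they join back into the original.
print(all_but_last_three + last_three == text)  # True
```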