資工遊俠劉建春(AaA / Amzshar / 燕俠 / JCLIUL)之IT人柱力(仙人模式): [Python][Regular Expressions][正規表示式] Python’s Regex Symbols

Friday, February 23, 2024

[Python][Regular Expressions][正規表示式] Python’s Regex Symbols

# Sample Code1:
import re
phoneNumRegex1 = re.compile(r'\d{3}-\d{3}-\d{4}')
mo1 = phoneNumRegex1.search('My number is 415-555-4242.')
print('Phone number found: ' + mo1.group())
# Phone number found: 415-555-4242

1. Grouping with Parentheses: ( )

2. Matching Multiple Groups with the Pipe: | (that is, "or")

3. Optional Matching with the Question Mark: (wo)? (optional: match zero or one)

4. Matching Zero or More with the Star: (wo)* (match zero or more)

5. Matching One or More with the Plus: (wo)+ (match one or more, that is, at least one)

6. Matching Specific Repetitions with Braces: {3}

7. The findall() Method: returns a list

8. Making Your Own Character Classes: [aeiouAEIOU]

# Sample Code2:
import re
vowelRegex = re.compile(r'[aeiouAEIOU]')
mo2 = vowelRegex.findall('RoboCop eats baby food. BABY FOOD.')
print(type(mo2))
# <class 'list'>
print(mo2)
# ['o', 'o', 'o', 'e', 'a', 'a', 'o', 'o', 'A', 'O', 'O']
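
Items 1 through 7 above have no sample of their own, so here is a minimal sketch of each. The variable names and test strings (phone_regex, hero_regex, 'Batwowoman', and so on) are illustrative assumptions, chosen to match the style of the other samples in this post.

```python
import re

# 1. Grouping: parentheses create numbered groups inside the match.
phone_regex = re.compile(r'(\d{3})-(\d{3}-\d{4})')
mo = phone_regex.search('My number is 415-555-4242.')
print(mo.group(1))   # 415
print(mo.group(2))   # 555-4242

# 2. Pipe: match either alternative; the first one found in the text wins.
hero_regex = re.compile(r'Batman|Tina Fey')
print(hero_regex.search('Batman and Tina Fey').group())   # Batman

# 3-5. Question mark, star, and plus applied to the group (wo).
optional_regex = re.compile(r'Bat(wo)?man')   # zero or one 'wo'
star_regex = re.compile(r'Bat(wo)*man')       # zero or more 'wo'
plus_regex = re.compile(r'Bat(wo)+man')       # one or more 'wo'
print(optional_regex.search('Batwowoman'))        # None (two 'wo' is too many)
print(star_regex.search('Batwowoman').group())    # Batwowoman
print(plus_regex.search('Batman'))                # None (needs at least one 'wo')

# 6. Braces: exactly three repetitions of the group.
ha_regex = re.compile(r'(Ha){3}')
print(ha_regex.search('HaHaHa').group())   # HaHaHa
print(ha_regex.search('HaHa'))             # None

# 7. findall() returns a list; with groups in the pattern, a list of tuples.
print(phone_regex.findall('Cell: 415-555-9999 Work: 212-555-0000'))
# [('415', '555-9999'), ('212', '555-0000')]
```

Note that search() returns None when nothing matches, so calling .group() on a failed search raises AttributeError; check the result first in real code.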

9. The Caret (^: begins with) and Dollar Sign ($: ends with) Characters

10. The Wildcard Character: . (dot) is the wildcard; it matches any character except a newline.

11. Matching Everything with Dot-Star (.*):

The dot (.) means: any single character except the newline

The star (*) means: zero or more of the preceding character

# Sample Code3:
import re
nameRegex = re.compile(r'First Name: (.*) Last Name: (.*)')
mo3 = nameRegex.search('First Name: Al Last Name: Amzshar')
print(mo3.group(1))
# Al
print(mo3.group(2))
# Amzshar
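
Items 9 and 10 can be tried out the same way. This is a small sketch; the regex names and sample sentences are assumptions made up for illustration.

```python
import re

# 9. Caret and dollar sign anchor the match to the start and end of the string.
begins_with_hello = re.compile(r'^Hello')
ends_with_digit = re.compile(r'\d$')
whole_string_digits = re.compile(r'^\d+$')

print(begins_with_hello.search('Hello world!').group())      # Hello
print(begins_with_hello.search('He said Hello.'))            # None
print(ends_with_digit.search('Your number is 42').group())   # 2
print(whole_string_digits.search('1234567890').group())      # 1234567890
print(whole_string_digits.search('12345xyz67890'))           # None

# 10. The dot matches exactly one character, so 'flat' only yields 'lat'.
at_regex = re.compile(r'.at')
at_words = at_regex.findall('The cat in the hat sat on the flat mat.')
print(at_words)
# ['cat', 'hat', 'sat', 'lat', 'mat']
```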

12. Matching Newlines with the Dot Character: newlineRegex = re.compile('.*', re.DOTALL)
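
To see what re.DOTALL changes, the sketch below runs the same '.*' pattern with and without the flag. The multi-line text and the name noNewlineRegex are assumptions added for the comparison; newlineRegex is the pattern from item 12.

```python
import re

text = 'Serve the public trust.\nProtect the innocent.\nUphold the law.'

# Without re.DOTALL the dot stops at the first newline.
noNewlineRegex = re.compile('.*')
first_line = noNewlineRegex.search(text).group()
print(first_line)
# Serve the public trust.

# With re.DOTALL the dot also matches newlines, so '.*' spans the whole text.
newlineRegex = re.compile('.*', re.DOTALL)
everything = newlineRegex.search(text).group()
print(everything == text)
# True
```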

The ? matches zero or one of the preceding group.

The * matches zero or more of the preceding group.

The + matches one or more of the preceding group.

The {n} matches exactly n of the preceding group.

The {n,} matches n or more of the preceding group.

The {,m} matches 0 to m of the preceding group.

The {n,m} matches at least n and at most m of the preceding group.

{n,m}?, *?, and +? perform a non-greedy (also called lazy) match of the preceding group.

^spam means the string must begin with spam.

spam$ means the string must end with spam.

The . (dot) wildcard matches any character except newline characters.

\d, \w, and \s match a digit, word, or space character, respectively.

\D, \W, and \S match anything except a digit, word, or space character, respectively.

[abc] matches any character between the brackets (such as a, b, or c).

[^abc] matches any character that isn’t between the brackets.
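
The difference between greedy and non-greedy matching from the summary above is easiest to see side by side. The following is a minimal sketch; the pattern names and sample strings are illustrative assumptions.

```python
import re

# {3,5} is greedy: it takes as many repetitions as it can (up to 5).
greedy_ha = re.compile(r'(Ha){3,5}')
# {3,5}? is non-greedy: it stops at the minimum that still matches (3).
lazy_ha = re.compile(r'(Ha){3,5}?')

print(greedy_ha.search('HaHaHaHaHa').group())   # HaHaHaHaHa
print(lazy_ha.search('HaHaHaHaHa').group())     # HaHaHa

# The same idea with dot-star: .* runs to the last '>', .*? stops at the first.
greedy_tag = re.compile(r'<.*>')
lazy_tag = re.compile(r'<.*?>')
tag_text = '<To serve man> for dinner.>'

print(greedy_tag.search(tag_text).group())   # <To serve man> for dinner.>
print(lazy_tag.search(tag_text).group())     # <To serve man>
```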

(1) Basic Syntax

(2) Regex Character Classes
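
As a quick illustration of the character classes, the sketch below combines the shorthand classes \d, \s, and \w, then shows \D and a negated bracket class. The names and input strings are assumptions chosen for the example.

```python
import re

# \d+ = one or more digits, \s = one whitespace character, \w+ = one or more word characters.
xmas_regex = re.compile(r'\d+\s\w+')
gifts = xmas_regex.findall('12 drummers, 11 pipers, 10 lords, 9 ladies, 8 maids')
print(gifts)
# ['12 drummers', '11 pipers', '10 lords', '9 ladies', '8 maids']

# \D matches anything that is not a digit; removing those leaves only the digits.
digits_only = re.sub(r'\D', '', '415-555-4242')
print(digits_only)
# 4155554242

# A caret right after the opening bracket negates the class: everything except vowels.
non_vowel_regex = re.compile(r'[^aeiouAEIOU]')
non_vowels = non_vowel_regex.findall('RoboCop eats.')
print(non_vowels)
# ['R', 'b', 'C', 'p', ' ', 't', 's', '.']
```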
