regex - How can I search and replace a term with brackets in Python without catastrophic backtracking?
I am trying to find terms like these (LaTeX definitions):

    \def\fb{\mathfrak{b}}

and then remove the complete term \def\fb{\mathfrak{b}} while replacing every \fb with \mathfrak{b}.
I came up with the following regex so far:

    curly = r"(?:\{(?:.*)?\})"  # make sure the number of brackets is right
    target = r"([^\{]*?" + curly + r"*)"
    search = r"(\\[a-zA-Z][a-zA-Z0-9]*)"
    defcommand = re.compile(r"\\def" + search + r"\{" + target + r"+\}")

But when I run it, there seems to be catastrophic backtracking, as the following minimal (not) working example shows:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import re

    text = r"""
    \newcommand*{\xindex}[1]{%
        \stepcounter{indexanchor}% create anchor unique
        \def\theindexterm{#1}%
        \edef\doindexentry{\noexpand\index
            {\expandonce\theindexterm|indexanchor{index-\theindexanchor}}}%
        \raisebox{\baselineskip}{\hypertarget{index-\theindexanchor}%
            {\doindexentry}}%
    }

    \def\fb{\mathfrak{b}}%für basis
    \def\cals{\mathcal{s}}%für subbasis
    \def\ft{\mathfrak{t}}%für topologie
    \def\fu{\mathfrak{u}}%für topologie

    \newlist{aufgabeenum}{enumerate}{1}
    \setlist[aufgabeenum]{label=(\alph*),ref=\textup{\theaufgabe~(\alph*)}}
    \crefalias{aufgabeenumi}{aufgabe}
    % commands local abbreviations
    """

    def print_matched_groups(m):
        # Print the groups of one match (group 0 is the whole match).
        print("number of groups: %i" % defcommand.groups)
        for i in range(defcommand.groups):
            print("group %i: %s" % (i, m.group(i)))
        print("done print_matched_groups")

    curly = r"(?:\{(?:.*)?\})"
    target = r"([^\{]*?" + curly + r"*)"
    search = r"(\\[a-zA-Z][a-zA-Z0-9]*)"
    defcommand = re.compile(r"\\def" + search + r"\{" + target + r"+\}")

    # This loop is where the script appears to hang.
    for m in defcommand.finditer(text):
        print_matched_groups(m)
    print("finished")

How can I avoid this?
I've found a practical solution. I know that the nesting is not deep, so I can use this information to make the regex faster:

    curly = r"(?:\{(?:.*)?\})"  # make sure the number of brackets is right
    # here is the change: at most 3 brace groups (Python's re accepts no space inside {0,3})
    target = r"([^\{]*?" + curly + r"{0,3})"
    search = r"(\\[a-zA-Z][a-zA-Z0-9]*)"
    defcommand = re.compile(r"\\def" + search + r"\{" + target + r"+\}")
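
The same "nesting is not deep" idea can also be written without `.*` and without a repeated group that may match the empty string, since those are the usual sources of heavy backtracking. The following is only a sketch under a few assumptions, not a definitive implementation: the helper names `nested_braces` and `expand_defs` and the constant `MAX_DEPTH` are made up for illustration, the nesting limit of 3 is a guess, and escaped braces such as `\{` inside a definition are not handled.

```python
import re


def nested_braces(depth):
    """Return a regex for text whose braces are balanced and nested
    at most `depth` levels deep."""
    pattern = r"[^{}]*"  # depth 0: no braces at all
    for _ in range(depth):
        # One more level: brace-free text, then any number of {...} groups,
        # each followed by brace-free text. Every repetition has to start
        # with a literal "{", so there is only one way to split the input.
        pattern = r"[^{}]*(?:\{" + pattern + r"\}[^{}]*)*"
    return pattern


MAX_DEPTH = 3  # assumption: definitions never nest deeper than this

# Group 1 is the macro name, group 2 is its body.
DEF_RE = re.compile(
    r"\\def(\\[a-zA-Z][a-zA-Z0-9]*)\{(" + nested_braces(MAX_DEPTH) + r")\}"
)


def expand_defs(text):
    """Remove every matched definition and substitute each macro's body
    for the remaining uses of the macro."""
    macros = {m.group(1): m.group(2) for m in DEF_RE.finditer(text)}
    text = DEF_RE.sub("", text)
    for name, body in macros.items():
        # The lookahead keeps \fb from matching inside a longer name like \fbx.
        use = re.compile(re.escape(name) + r"(?![a-zA-Z0-9])")
        # A function replacement avoids escape processing of the backslashes
        # in the body (a plain string such as "\mathfrak" would be rejected).
        text = use.sub(lambda _m, body=body: body, text)
    return text


sample = r"\def\fb{\mathfrak{b}}% basis" + "\n" + r"Let $\fb$ be a basis; \fbx is untouched."
print(expand_defs(sample))
```

Note that this replaces every matching \def, including ones inside other commands (such as \def\theindexterm{#1} in the example text above), so the input may need to be restricted to the lines that should be expanded.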