URL parsing: A ticking time bomb of security exploits


The modern world would grind to a halt without URLs, but years of inconsistent parsing specifications have created an environment ripe for exploitation that puts countless businesses at risk.

Web browser closeup on LCD screen with shallow focus on https padlock

Image: RobertAx, Getty Images/iStockphoto

A team of security researchers has discovered serious flaws in the way the modern internet parses URLs: Specifically, there are too many URL parsers with inconsistent rules, which has created a worldwide web easily exploited by savvy attackers.

We don't even need to look very hard to find an example of URL parsing being manipulated in the wild to devastating effect: The late-2021 Log4j exploit is a perfect example, the researchers said in their report.

"Because of Log4j's popularity, millions of servers and applications were affected, forcing administrators to find where Log4j may be in their environments and their vulnerability to proof-of-concept attacks in the wild," the report said.


Without going too deeply into Log4j, the basics are that it uses a malicious string that, when logged, triggers a Java lookup that connects the victim to the attacker's machine, which is then used to deliver a payload.

The remedy that was initially implemented for Log4j involved only allowing Java lookups to whitelisted sites. Attackers pivoted quickly to find a way around the fix, and found that, by adding the localhost to the malicious URL and separating it with a # symbol, they were able to confuse the parsers and carry on attacking.
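
To make the # trick concrete, here is a minimal Python sketch of two parsers disagreeing about the same URL. It is not Log4j's code: the URL is illustrative, the attacker.example host is made up, and naive_host is a hypothetical hand-rolled parser standing in for whichever component performs the actual lookup.

```python
from urllib.parse import urlsplit

# Illustrative URL only: localhost first, then '#', then a made-up attacker host.
url = "ldap://127.0.0.1#.attacker.example:1389/a"


def naive_host(u: str) -> str:
    """Hypothetical hand-rolled parser: the host is whatever sits between
    '://' and the next '/', minus a trailing ':port'. It ignores '#'."""
    authority = u.split("://", 1)[1].split("/", 1)[0]
    return authority.rsplit(":", 1)[0]


# A standards-minded parser ends the authority at '#', so it sees localhost.
validator_view = urlsplit(url).hostname

# The naive parser keeps reading past '#', so it sees a different host.
resolver_view = naive_host(url)

print("validator sees:", validator_view)
print("resolver sees: ", resolver_view)
```

If the allowlist check trusts the first view while the connection uses the second, the check passes for one host and the request goes to another.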

Log4j was serious; the fact that it relied on something as universal as URLs makes it even more so. To understand why URL parsing vulnerabilities are so dangerous, it helps to know exactly what URL parsing means, and the report does a good job of explaining just that.


Figure A: The 5 parts of a URL

Image: Claroty/Team82/Snyk

The color-coded URL in Figure A shows an address broken down into its five different parts. In 1994, way back when URLs were first defined, systems for translating URLs into machine language were created, and since then several new requests for comment (RFCs) have further elaborated on URL standards.
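
As a rough illustration of that breakdown, Python's standard urllib.parse splits a URL into the five components named in RFC 3986: scheme, authority (which Python calls netloc), path, query and fragment. The example URL below is made up for the sketch and is not the one shown in Figure A.

```python
from urllib.parse import urlsplit

# Made-up URL that exercises all five components.
parts = urlsplit("https://user@example.com:8443/docs/index.html?lang=en#intro")

print("scheme:   ", parts.scheme)    # protocol to use
print("authority:", parts.netloc)    # optional userinfo, host, optional port
print("path:     ", parts.path)      # resource location on the host
print("query:    ", parts.query)     # key=value parameters after '?'
print("fragment: ", parts.fragment)  # client-side anchor after '#'

# The authority can be broken down further.
print("host:", parts.hostname, "port:", parts.port)
```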

Unfortunately, not all parsers have kept up with newer standards, which means there are a lot of parsers, and many have different ideas of how to interpret a URL. Therein lies the problem.

URL parsing flaws: What researchers found

Researchers at Team82 and Snyk worked together to analyze 16 different URL parsing libraries and tools written in a variety of languages:

  1. urllib (Python)
  2. urllib3 (Python)
  3. rfc3986 (Python)
  4. httptools (Python)
  5. curl lib (cURL)
  6. Wget 
  7. Chrome (Browser)
  8. Uri (.NET)
  9. URL (Java)
  10. URI (Java)
  11. parse_url (PHP)
  12. url (NodeJS)
  13. url-parse (NodeJS) 
  14. net/url (Go)
  15. uri (Ruby)
  16. URI (Perl)

Their analyses of those parsers identified five different scenarios in which most URL parsers behave in unexpected ways:

  • Scheme confusion, in which the attacker uses a malformed URL scheme
  • Slash confusion, which involves using an unexpected number of slashes
  • Backslash confusion, which involves putting any backslashes (\) into a URL
  • URL-encoded data confusion, which involves URLs that contain URL-encoded data
  • Scheme mixup, which involves parsing a URL with a specific scheme (HTTP, HTTPS, etc.)
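
A short Python sketch can show what four of these categories look like to one particular parser, CPython's urllib.parse. The sample URLs and the trusted.example and attacker.example hosts are invented for illustration and are not taken from the report; other parsers may well give different answers, which is exactly the problem. A browser-style parser, for instance, typically treats a backslash like a forward slash.

```python
from urllib.parse import urlsplit, unquote

# Invented sample inputs, one per confusion category.
samples = {
    "scheme confusion": "example.com/path",                 # no scheme at all
    "slash confusion": "https:///example.com/path",         # one slash too many
    "backslash confusion": "https://trusted.example\\@attacker.example/",
    "url-encoded data": "https://example.com/%2e%2e/admin",  # encoded '..'
}

results = {}
for label, url in samples.items():
    p = urlsplit(url)
    results[label] = (p.scheme, p.hostname, p.path)
    print(f"{label:20} scheme={p.scheme!r} host={p.hostname!r} path={p.path!r}")

# urlsplit leaves percent-encoding alone; a later decoding step changes the path.
decoded_path = unquote(results["url-encoded data"][2])
print("decoded path:", decoded_path)
```

Any check that runs before the decoding step, or on a different parser's idea of the host, is validating something other than what is eventually used.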

Eight documented and patched vulnerabilities were identified in the course of the research, but the team said that unsupported versions of Flask still contain these vulnerabilities: You've been warned.

What you can do to avoid URL parsing attacks

It's a good idea to proactively protect yourself against vulnerabilities with the potential to wreak havoc on the Log4j scale, but given the low-level necessity of URL parsers, it might not be easy.

The report authors recommend starting by taking the time to identify the parsers used in your software and to understand how they behave differently, what kinds of URLs they support and more. Additionally, never trust user-supplied URLs: Canonicalize and validate them first, with parser differences being accounted for in the validation process.
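
One way that advice might look in practice is sketched below in Python. This is an assumption-laden example rather than the report's method: the function name, the allowed schemes and the api.example.com allowlist are all hypothetical, and a real deployment would need rules that match its own parsers.

```python
from urllib.parse import urlsplit, urlunsplit

ALLOWED_SCHEMES = {"http", "https"}   # assumption: only web URLs are expected
ALLOWED_HOSTS = {"api.example.com"}   # hypothetical allowlist


def canonicalize_and_validate(raw: str) -> str:
    """Return a rebuilt, canonical URL or raise ValueError."""
    # Reject characters that different parsers are known to treat differently.
    if any(ch == "\\" or ch.isspace() or ord(ch) < 0x20 for ch in raw):
        raise ValueError("backslash, whitespace or control character in URL")

    parts = urlsplit(raw)  # urlsplit lowercases the scheme
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError("scheme not allowed")
    if parts.username is not None or parts.password is not None:
        raise ValueError("userinfo not allowed")

    host = parts.hostname  # lowercased, or None when the authority is empty
    if host is None or host not in ALLOWED_HOSTS:
        raise ValueError("host not allowed")

    port = parts.port  # raises ValueError on a non-numeric port
    netloc = host if port is None else f"{host}:{port}"

    # Rebuild from the validated pieces and drop the fragment, so downstream
    # code receives this string rather than the original input.
    return urlunsplit((parts.scheme, netloc, parts.path or "/", parts.query, ""))


print(canonicalize_and_validate("HTTPS://API.example.com:443/v1?x=1#frag"))
```

The key design choice is that the function returns a URL rebuilt from the validated components, so nothing that the validator did not understand survives into later stages.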


The report also has some general best practice tips for URL parsing that can help minimize the potential of falling victim to a parsing attack:

  • Try to use as few URL parsers as possible, or none at all. The report authors say "it is easily achievable in many cases."
  • If using microservices, parse the URL at the front end and send the parsed info across environments.
  • Parsers involved with application business logic often behave differently. Understand those differences and how they affect additional systems.
  • Canonicalize before parsing. That way, even if a malicious URL is present, the known trusted one is what gets forwarded to the parser and beyond.
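
The "parse once at the front end" tip could be sketched along these lines in Python. The ParsedTarget type, the function names and the JSON message format are all hypothetical choices made for this example; the point is only that downstream services read structured fields instead of re-parsing the raw string with a different parser.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional
from urllib.parse import urlsplit


@dataclass(frozen=True)
class ParsedTarget:
    """Hypothetical structured form of a URL that services pass around."""
    scheme: str
    host: str
    port: Optional[int]
    path: str
    query: str


def parse_at_edge(raw: str) -> str:
    """Front end: parse exactly once and serialize the components."""
    parts = urlsplit(raw)
    if not parts.scheme or parts.hostname is None:
        raise ValueError("URL must be absolute")
    target = ParsedTarget(
        parts.scheme, parts.hostname, parts.port, parts.path or "/", parts.query
    )
    return json.dumps(asdict(target))


def downstream_host(message: str) -> str:
    """Downstream service: read the host field, never re-parse a URL string."""
    return ParsedTarget(**json.loads(message)).host


message = parse_at_edge("https://api.example.com:8443/v1/items?id=7")
print(message)
print(downstream_host(message))
```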
