1 | /**************************************************************************** |
2 | ** |
3 | ** Copyright (C) 2016 The Qt Company Ltd. |
4 | ** Copyright (C) 2016 Intel Corporation. |
5 | ** Contact: https://www.qt.io/licensing/ |
6 | ** |
7 | ** This file is part of the QtCore module of the Qt Toolkit. |
8 | ** |
9 | ** $QT_BEGIN_LICENSE:LGPL$ |
10 | ** Commercial License Usage |
11 | ** Licensees holding valid commercial Qt licenses may use this file in |
12 | ** accordance with the commercial license agreement provided with the |
13 | ** Software or, alternatively, in accordance with the terms contained in |
14 | ** a written agreement between you and The Qt Company. For licensing terms |
15 | ** and conditions see https://www.qt.io/terms-conditions. For further |
16 | ** information use the contact form at https://www.qt.io/contact-us. |
17 | ** |
18 | ** GNU Lesser General Public License Usage |
19 | ** Alternatively, this file may be used under the terms of the GNU Lesser |
20 | ** General Public License version 3 as published by the Free Software |
21 | ** Foundation and appearing in the file LICENSE.LGPL3 included in the |
22 | ** packaging of this file. Please review the following information to |
23 | ** ensure the GNU Lesser General Public License version 3 requirements |
24 | ** will be met: https://www.gnu.org/licenses/lgpl-3.0.html. |
25 | ** |
26 | ** GNU General Public License Usage |
27 | ** Alternatively, this file may be used under the terms of the GNU |
28 | ** General Public License version 2.0 or (at your option) the GNU General |
29 | ** Public license version 3 or any later version approved by the KDE Free |
30 | ** Qt Foundation. The licenses are as published by the Free Software |
31 | ** Foundation and appearing in the file LICENSE.GPL2 and LICENSE.GPL3 |
32 | ** included in the packaging of this file. Please review the following |
33 | ** information to ensure the GNU General Public License requirements will |
34 | ** be met: https://www.gnu.org/licenses/gpl-2.0.html and |
35 | ** https://www.gnu.org/licenses/gpl-3.0.html. |
36 | ** |
37 | ** $QT_END_LICENSE$ |
38 | ** |
39 | ****************************************************************************/ |
40 | |
41 | /*! |
42 | \class QUrl |
43 | \inmodule QtCore |
44 | |
45 | \brief The QUrl class provides a convenient interface for working |
46 | with URLs. |
47 | |
48 | \reentrant |
49 | \ingroup io |
50 | \ingroup network |
51 | \ingroup shared |
52 | |
53 | |
54 | It can parse and construct URLs in both encoded and unencoded |
55 | form. QUrl also has support for internationalized domain names |
56 | (IDNs). |
57 | |
58 | The most common way to use QUrl is to initialize it via the |
59 | constructor by passing a QString. Otherwise, setUrl() can also |
60 | be used. |
61 | |
62 | URLs can be represented in two forms: encoded or unencoded. The |
63 | unencoded representation is suitable for showing to users, but |
64 | the encoded representation is typically what you would send to |
65 | a web server. For example, the unencoded URL |
66 | "http://bühler.example.com/List of applicants.xml" |
67 | would be sent to the server as |
68 | "http://xn--bhler-kva.example.com/List%20of%20applicants.xml". |
69 | |
70 | A URL can also be constructed piece by piece by calling |
71 | setScheme(), setUserName(), setPassword(), setHost(), setPort(), |
72 | setPath(), setQuery() and setFragment(). Some convenience |
73 | functions are also available: setAuthority() sets the user name, |
74 | password, host and port. setUserInfo() sets the user name and |
75 | password at once. |
76 | |
77 | Call isValid() to check if the URL is valid. This can be done at any point |
78 | during the constructing of a URL. If isValid() returns \c false, you should |
79 | clear() the URL before proceeding, or start over by parsing a new URL with |
80 | setUrl(). |
81 | |
82 | Constructing a query is particularly convenient through the use of the \l |
83 | QUrlQuery class and its methods QUrlQuery::setQueryItems(), |
84 | QUrlQuery::addQueryItem() and QUrlQuery::removeQueryItem(). Use |
85 | QUrlQuery::setQueryDelimiters() to customize the delimiters used for |
86 | generating the query string. |
87 | |
88 | For the convenience of generating encoded URL strings or query |
89 | strings, there are two static functions called |
90 | fromPercentEncoding() and toPercentEncoding() which deal with |
91 | percent encoding and decoding of QString objects. |
92 | |
93 | fromLocalFile() constructs a QUrl by parsing a local |
94 | file path. toLocalFile() converts a URL to a local file path. |
95 | |
96 | The human readable representation of the URL is fetched with |
97 | toString(). This representation is appropriate for displaying a |
98 | URL to a user in unencoded form. The encoded form however, as |
99 | returned by toEncoded(), is for internal use, passing to web |
100 | servers, mail clients and so on. Both forms are technically correct |
101 | and represent the same URL unambiguously -- in fact, passing either |
102 | form to QUrl's constructor or to setUrl() will yield the same QUrl |
103 | object. |
104 | |
105 | QUrl conforms to the URI specification from |
106 | \l{RFC 3986} (Uniform Resource Identifier: Generic Syntax), and includes |
107 | scheme extensions from \l{RFC 1738} (Uniform Resource Locators). Case |
108 | folding rules in QUrl conform to \l{RFC 3491} (Nameprep: A Stringprep |
109 | Profile for Internationalized Domain Names (IDN)). It is also compatible with the |
110 | \l{http://freedesktop.org/wiki/Specifications/file-uri-spec/}{file URI specification} |
111 | from freedesktop.org, provided that the locale encodes file names using |
112 | UTF-8 (required by IDN). |
113 | |
114 | \section2 Relative URLs vs Relative Paths |
115 | |
116 | Calling isRelative() will return whether or not the URL is relative. |
117 | A relative URL has no \l {scheme}. For example: |
118 | |
119 | \snippet code/src_corelib_io_qurl.cpp 8 |
120 | |
121 | Notice that a URL can be absolute while containing a relative path, and |
122 | vice versa: |
123 | |
124 | \snippet code/src_corelib_io_qurl.cpp 9 |
125 | |
126 | A relative URL can be resolved by passing it as an argument to resolved(), |
127 | which returns an absolute URL. isParentOf() is used for determining whether |
128 | one URL is a parent of another. |
129 | |
130 | \section2 Error checking |
131 | |
132 | QUrl is capable of detecting many errors in URLs while parsing it or when |
133 | components of the URL are set with individual setter methods (like |
134 | setScheme(), setHost() or setPath()). If the parsing or setter function is |
135 | successful, any previously recorded error conditions will be discarded. |
136 | |
137 | By default, QUrl setter methods operate in QUrl::TolerantMode, which means |
138 | they accept some common mistakes and mis-representation of data. An |
139 | alternate method of parsing is QUrl::StrictMode, which applies further |
140 | checks. See QUrl::ParsingMode for a description of the difference of the |
141 | parsing modes. |
142 | |
143 | QUrl only checks for conformance with the URL specification. It does not |
144 | try to verify that high-level protocol URLs are in the format they are |
145 | expected to be by handlers elsewhere. For example, the following URIs are |
146 | all considered valid by QUrl, even if they do not make sense when used: |
147 | |
148 | \list |
149 | \li "http:/filename.html" |
150 | \li "mailto://example.com" |
151 | \endlist |
152 | |
153 | When the parser encounters an error, it signals the event by making |
154 | isValid() return false and toString() / toEncoded() return an empty string. |
155 | If it is necessary to show the user the reason why the URL failed to parse, |
156 | the error condition can be obtained from QUrl by calling errorString(). |
157 | Note that this message is highly technical and may not make sense to |
158 | end-users. |
159 | |
160 | QUrl is capable of recording only one error condition. If more than one |
161 | error is found, it is undefined which error is reported. |
162 | |
163 | \section2 Character Conversions |
164 | |
165 | Follow these rules to avoid erroneous character conversion when |
166 | dealing with URLs and strings: |
167 | |
168 | \list |
169 | \li When creating a QString to contain a URL from a QByteArray or a |
170 | char*, always use QString::fromUtf8(). |
171 | \endlist |
172 | */ |
173 | |
174 | /*! |
175 | \enum QUrl::ParsingMode |
176 | |
177 | The parsing mode controls the way QUrl parses strings. |
178 | |
179 | \value TolerantMode QUrl will try to correct some common errors in URLs. |
180 | This mode is useful for parsing URLs coming from sources |
181 | not known to be strictly standards-conforming. |
182 | |
183 | \value StrictMode Only valid URLs are accepted. This mode is useful for |
184 | general URL validation. |
185 | |
186 | \value DecodedMode QUrl will interpret the URL component in the fully-decoded form, |
187 | where percent characters stand for themselves, not as the beginning |
188 | of a percent-encoded sequence. This mode is only valid for the |
189 | setters setting components of a URL; it is not permitted in |
190 | the QUrl constructor, in fromEncoded() or in setUrl(). |
191 | For more information on this mode, see the documentation for |
192 | \l {QUrl::ComponentFormattingOption}{QUrl::FullyDecoded}. |
193 | |
194 | In TolerantMode, the parser has the following behaviour: |
195 | |
196 | \list |
197 | |
198 | \li Spaces and "%20": unencoded space characters will be accepted and will |
199 | be treated as equivalent to "%20". |
200 | |
201 | \li Single "%" characters: Any occurrences of a percent character "%" not |
202 | followed by exactly two hexadecimal characters (e.g., "13% coverage.html") |
203 | will be replaced by "%25". Note that one lone "%" character will trigger |
204 | the correction mode for all percent characters. |
205 | |
206 | \li Reserved and unreserved characters: An encoded URL should only |
207 | contain a few characters as literals; all other characters should |
208 | be percent-encoded. In TolerantMode, these characters will be |
209 | accepted if they are found in the URL: |
210 | space / double-quote / "<" / ">" / "\" / |
211 | "^" / "`" / "{" / "|" / "}" |
212 | Those same characters can be decoded again by passing QUrl::DecodeReserved |
213 | to toString() or toEncoded(). In the getters of individual components, |
214 | those characters are often returned in decoded form. |
215 | |
216 | \endlist |
217 | |
218 | When in StrictMode, if a parsing error is found, isValid() will return \c |
219 | false and errorString() will return a message describing the error. |
220 | If more than one error is detected, it is undefined which error gets |
221 | reported. |
222 | |
223 | Note that TolerantMode is not usually enough for parsing user input, which |
224 | often contains more errors and expectations than the parser can deal with. |
225 | When dealing with data coming directly from the user -- as opposed to data |
226 | coming from data-transfer sources, such as other programs -- it is |
227 | recommended to use fromUserInput(). |
228 | |
229 | \sa fromUserInput(), setUrl(), toString(), toEncoded(), QUrl::FormattingOptions |
230 | */ |
231 | |
232 | /*! |
233 | \enum QUrl::UrlFormattingOption |
234 | |
235 | The formatting options define how the URL is formatted when written out |
236 | as text. |
237 | |
238 | \value None The format of the URL is unchanged. |
239 | \value RemoveScheme The scheme is removed from the URL. |
240 | \value RemovePassword Any password in the URL is removed. |
241 | \value RemoveUserInfo Any user information in the URL is removed. |
242 | \value RemovePort Any specified port is removed from the URL. |
243 | \value RemoveAuthority |
244 | \value RemovePath The URL's path is removed, leaving only the scheme, |
245 | host address, and port (if present). |
246 | \value RemoveQuery The query part of the URL (following a '?' character) |
247 | is removed. |
248 | \value RemoveFragment |
249 | \value RemoveFilename The filename (i.e. everything after the last '/' in the path) is removed. |
250 | The trailing '/' is kept, unless StripTrailingSlash is set. |
251 | Only valid if RemovePath is not set. |
252 | \value PreferLocalFile If the URL is a local file according to isLocalFile() |
253 | and contains no query or fragment, a local file path is returned. |
254 | \value StripTrailingSlash The trailing slash is removed from the path, if one is present. |
255 | \value NormalizePathSegments Modifies the path to remove redundant directory separators, |
256 | and to resolve "."s and ".."s (as far as possible). For non-local paths, adjacent |
257 | slashes are preserved. |
258 | |
259 | Note that the case folding rules in \l{RFC 3491}{Nameprep}, which QUrl |
260 | conforms to, require host names to always be converted to lower case, |
261 | regardless of the Qt::FormattingOptions used. |
262 | |
263 | The options from QUrl::ComponentFormattingOptions are also possible. |
264 | |
265 | \sa QUrl::ComponentFormattingOptions |
266 | */ |
267 | |
268 | /*! |
269 | \enum QUrl::ComponentFormattingOption |
270 | \since 5.0 |
271 | |
272 | The component formatting options define how the components of an URL will |
273 | be formatted when written out as text. They can be combined with the |
274 | options from QUrl::FormattingOptions when used in toString() and |
275 | toEncoded(). |
276 | |
277 | \value PrettyDecoded The component is returned in a "pretty form", with |
278 | most percent-encoded characters decoded. The exact |
279 | behavior of PrettyDecoded varies from component to |
280 | component and may also change from Qt release to Qt |
281 | release. This is the default. |
282 | |
283 | \value EncodeSpaces Leave space characters in their encoded form ("%20"). |
284 | |
285 | \value EncodeUnicode Leave non-US-ASCII characters encoded in their UTF-8 |
286 | percent-encoded form (e.g., "%C3%A9" for the U+00E9 |
287 | codepoint, LATIN SMALL LETTER E WITH ACUTE). |
288 | |
289 | \value EncodeDelimiters Leave certain delimiters in their encoded form, as |
290 | would appear in the URL when the full URL is |
291 | represented as text. The delimiters are affected |
292 | by this option change from component to component. |
293 | This flag has no effect in toString() or toEncoded(). |
294 | |
295 | \value EncodeReserved Leave US-ASCII characters not permitted in the URL by |
296 | the specification in their encoded form. This is the |
297 | default on toString() and toEncoded(). |
298 | |
299 | \value DecodeReserved Decode the US-ASCII characters that the URL specification |
300 | does not allow to appear in the URL. This is the |
301 | default on the getters of individual components. |
302 | |
303 | \value FullyEncoded Leave all characters in their properly-encoded form, |
304 | as this component would appear as part of a URL. When |
305 | used with toString(), this produces a fully-compliant |
306 | URL in QString form, exactly equal to the result of |
307 | toEncoded() |
308 | |
309 | \value FullyDecoded Attempt to decode as much as possible. For individual |
310 | components of the URL, this decodes every percent |
311 | encoding sequence, including control characters (U+0000 |
312 | to U+001F) and UTF-8 sequences found in percent-encoded form. |
313 | Use of this mode may cause data loss, see below for more information. |
314 | |
315 | The values of EncodeReserved and DecodeReserved should not be used together |
316 | in one call. The behavior is undefined if that happens. They are provided |
317 | as separate values because the behavior of the "pretty mode" with regards |
318 | to reserved characters is different on certain components and specially on |
319 | the full URL. |
320 | |
321 | \section2 Full decoding |
322 | |
323 | The FullyDecoded mode is similar to the behavior of the functions returning |
324 | QString in Qt 4.x, in that every character represents itself and never has |
325 | any special meaning. This is true even for the percent character ('%'), |
326 | which should be interpreted to mean a literal percent, not the beginning of |
327 | a percent-encoded sequence. The same actual character, in all other |
328 | decoding modes, is represented by the sequence "%25". |
329 | |
330 | Whenever re-applying data obtained with QUrl::FullyDecoded into a QUrl, |
331 | care must be taken to use the QUrl::DecodedMode parameter to the setters |
332 | (like setPath() and setUserName()). Failure to do so may cause |
333 | re-interpretation of the percent character ('%') as the beginning of a |
334 | percent-encoded sequence. |
335 | |
336 | This mode is quite useful when portions of a URL are used in a non-URL |
337 | context. For example, to extract the username, password or file paths in an |
338 | FTP client application, the FullyDecoded mode should be used. |
339 | |
340 | This mode should be used with care, since there are two conditions that |
341 | cannot be reliably represented in the returned QString. They are: |
342 | |
343 | \list |
344 | \li \b{Non-UTF-8 sequences:} URLs may contain sequences of |
345 | percent-encoded characters that do not form valid UTF-8 sequences. Since |
346 | URLs need to be decoded using UTF-8, any decoder failure will result in |
347 | the QString containing one or more replacement characters where the |
348 | sequence existed. |
349 | |
350 | \li \b{Encoded delimiters:} URLs are also allowed to make a distinction |
351 | between a delimiter found in its literal form and its equivalent in |
352 | percent-encoded form. This is most commonly found in the query, but is |
353 | permitted in most parts of the URL. |
354 | \endlist |
355 | |
356 | The following example illustrates the problem: |
357 | |
358 | \snippet code/src_corelib_io_qurl.cpp 10 |
359 | |
360 | If the two URLs were used via HTTP GET, the interpretation by the web |
361 | server would probably be different. In the first case, it would interpret |
362 | as one parameter, with a key of "q" and value "a+=b&c". In the second |
363 | case, it would probably interpret as two parameters, one with a key of "q" |
364 | and value "a =b", and the second with a key "c" and no value. |
365 | |
366 | \sa QUrl::FormattingOptions |
367 | */ |
368 | |
369 | /*! |
370 | \enum QUrl::UserInputResolutionOption |
371 | \since 5.4 |
372 | |
373 | The user input resolution options define how fromUserInput() should |
374 | interpret strings that could either be a relative path or the short |
375 | form of a HTTP URL. For instance \c{file.pl} can be either a local file |
376 | or the URL \c{http://file.pl}. |
377 | |
378 | \value DefaultResolution The default resolution mechanism is to check |
379 | whether a local file exists, in the working |
380 | directory given to fromUserInput, and only |
381 | return a local path in that case. Otherwise a URL |
382 | is assumed. |
383 | \value AssumeLocalFile This option makes fromUserInput() always return |
384 | a local path unless the input contains a scheme, such as |
385 | \c{http://file.pl}. This is useful for applications |
386 | such as text editors, which are able to create |
387 | the file if it doesn't exist. |
388 | |
389 | \sa fromUserInput() |
390 | */ |
391 | |
392 | /*! |
393 | \fn QUrl::QUrl(QUrl &&other) |
394 | |
395 | Move-constructs a QUrl instance, making it point at the same |
396 | object that \a other was pointing to. |
397 | |
398 | \since 5.2 |
399 | */ |
400 | |
401 | /*! |
402 | \fn QUrl &QUrl::operator=(QUrl &&other) |
403 | |
404 | Move-assigns \a other to this QUrl instance. |
405 | |
406 | \since 5.2 |
407 | */ |
408 | |
409 | #include "qurl.h" |
410 | #include "qurl_p.h" |
411 | #include "qplatformdefs.h" |
412 | #include "qstring.h" |
413 | #include "qstringlist.h" |
414 | #include "qdebug.h" |
415 | #include "qhash.h" |
416 | #include "qdir.h" // for QDir::fromNativeSeparators |
417 | #include "qdatastream.h" |
418 | #include "private/qipaddress_p.h" |
419 | #include "qurlquery.h" |
420 | #include "private/qdir_p.h" |
421 | #include <private/qmemory_p.h> |
422 | |
423 | QT_BEGIN_NAMESPACE |
424 | |
425 | // in qstring.cpp: |
426 | void qt_from_latin1(char16_t *dst, const char *str, size_t size) noexcept; |
427 | |
428 | inline static bool isHex(char c) |
429 | { |
430 | c |= 0x20; |
431 | return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f'); |
432 | } |
433 | |
434 | static inline QString ftpScheme() |
435 | { |
436 | return QStringLiteral("ftp" ); |
437 | } |
438 | |
439 | static inline QString fileScheme() |
440 | { |
441 | return QStringLiteral("file" ); |
442 | } |
443 | |
444 | static inline QString webDavScheme() |
445 | { |
446 | return QStringLiteral("webdavs" ); |
447 | } |
448 | |
449 | static inline QString webDavSslTag() |
450 | { |
451 | return QStringLiteral("@SSL" ); |
452 | } |
453 | |
454 | class QUrlPrivate |
455 | { |
456 | public: |
457 | enum Section : uchar { |
458 | Scheme = 0x01, |
459 | UserName = 0x02, |
460 | Password = 0x04, |
461 | UserInfo = UserName | Password, |
462 | Host = 0x08, |
463 | Port = 0x10, |
464 | Authority = UserInfo | Host | Port, |
465 | Path = 0x20, |
466 | Hierarchy = Authority | Path, |
467 | Query = 0x40, |
468 | Fragment = 0x80, |
469 | FullUrl = 0xff |
470 | }; |
471 | |
472 | enum Flags : uchar { |
473 | IsLocalFile = 0x01 |
474 | }; |
475 | |
476 | enum ErrorCode { |
477 | // the high byte of the error code matches the Section |
478 | // the first item in each value must be the generic "Invalid xxx Error" |
479 | InvalidSchemeError = Scheme << 8, |
480 | |
481 | InvalidUserNameError = UserName << 8, |
482 | |
483 | InvalidPasswordError = Password << 8, |
484 | |
485 | InvalidRegNameError = Host << 8, |
486 | InvalidIPv4AddressError, |
487 | InvalidIPv6AddressError, |
488 | InvalidCharacterInIPv6Error, |
489 | InvalidIPvFutureError, |
490 | HostMissingEndBracket, |
491 | |
492 | InvalidPortError = Port << 8, |
493 | PortEmptyError, |
494 | |
495 | InvalidPathError = Path << 8, |
496 | |
497 | InvalidQueryError = Query << 8, |
498 | |
499 | InvalidFragmentError = Fragment << 8, |
500 | |
501 | // the following three cases are only possible in combination with |
502 | // presence/absence of the path, authority and scheme. See validityError(). |
503 | AuthorityPresentAndPathIsRelative = Authority << 8 | Path << 8 | 0x10000, |
504 | AuthorityAbsentAndPathIsDoubleSlash, |
505 | RelativeUrlPathContainsColonBeforeSlash = Scheme << 8 | Authority << 8 | Path << 8 | 0x10000, |
506 | |
507 | NoError = 0 |
508 | }; |
509 | |
510 | struct Error { |
511 | QString source; |
512 | ErrorCode code; |
513 | int position; |
514 | }; |
515 | |
516 | QUrlPrivate(); |
517 | QUrlPrivate(const QUrlPrivate ©); |
518 | ~QUrlPrivate(); |
519 | |
520 | void parse(const QString &url, QUrl::ParsingMode parsingMode); |
521 | bool isEmpty() const |
522 | { return sectionIsPresent == 0 && port == -1 && path.isEmpty(); } |
523 | |
524 | std::unique_ptr<Error> cloneError() const; |
525 | void clearError(); |
526 | void setError(ErrorCode errorCode, const QString &source, int supplement = -1); |
527 | ErrorCode validityError(QString *source = nullptr, int *position = nullptr) const; |
528 | bool validateComponent(Section section, const QString &input, int begin, int end); |
529 | bool validateComponent(Section section, const QString &input) |
530 | { return validateComponent(section, input, 0, uint(input.length())); } |
531 | |
532 | // no QString scheme() const; |
533 | void appendAuthority(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const; |
534 | void appendUserInfo(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const; |
535 | void appendUserName(QString &appendTo, QUrl::FormattingOptions options) const; |
536 | void appendPassword(QString &appendTo, QUrl::FormattingOptions options) const; |
537 | void appendHost(QString &appendTo, QUrl::FormattingOptions options) const; |
538 | void appendPath(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const; |
539 | void appendQuery(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const; |
540 | void appendFragment(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const; |
541 | |
542 | // the "end" parameters are like STL iterators: they point to one past the last valid element |
543 | bool setScheme(const QString &value, int len, bool doSetError); |
544 | void setAuthority(const QString &auth, int from, int end, QUrl::ParsingMode mode); |
545 | void setUserInfo(const QString &userInfo, int from, int end); |
546 | void setUserName(const QString &value, int from, int end); |
547 | void setPassword(const QString &value, int from, int end); |
548 | bool setHost(const QString &value, int from, int end, QUrl::ParsingMode mode); |
549 | void setPath(const QString &value, int from, int end); |
550 | void setQuery(const QString &value, int from, int end); |
551 | void setFragment(const QString &value, int from, int end); |
552 | |
553 | inline bool hasScheme() const { return sectionIsPresent & Scheme; } |
554 | inline bool hasAuthority() const { return sectionIsPresent & Authority; } |
555 | inline bool hasUserInfo() const { return sectionIsPresent & UserInfo; } |
556 | inline bool hasUserName() const { return sectionIsPresent & UserName; } |
557 | inline bool hasPassword() const { return sectionIsPresent & Password; } |
558 | inline bool hasHost() const { return sectionIsPresent & Host; } |
559 | inline bool hasPort() const { return port != -1; } |
560 | inline bool hasPath() const { return !path.isEmpty(); } |
561 | inline bool hasQuery() const { return sectionIsPresent & Query; } |
562 | inline bool hasFragment() const { return sectionIsPresent & Fragment; } |
563 | |
564 | inline bool isLocalFile() const { return flags & IsLocalFile; } |
565 | QString toLocalFile(QUrl::FormattingOptions options) const; |
566 | |
567 | QString mergePaths(const QString &relativePath) const; |
568 | |
569 | QAtomicInt ref; |
570 | int port; |
571 | |
572 | QString scheme; |
573 | QString userName; |
574 | QString password; |
575 | QString host; |
576 | QString path; |
577 | QString query; |
578 | QString fragment; |
579 | |
580 | std::unique_ptr<Error> error; |
581 | |
582 | // not used for: |
583 | // - Port (port == -1 means absence) |
584 | // - Path (there's no path delimiter, so we optimize its use out of existence) |
585 | // Schemes are never supposed to be empty, but we keep the flag anyway |
586 | uchar sectionIsPresent; |
587 | uchar flags; |
588 | |
589 | // 32-bit: 2 bytes tail padding available |
590 | // 64-bit: 6 bytes tail padding available |
591 | }; |
592 | |
593 | inline QUrlPrivate::QUrlPrivate() |
594 | : ref(1), port(-1), |
595 | sectionIsPresent(0), |
596 | flags(0) |
597 | { |
598 | } |
599 | |
600 | inline QUrlPrivate::QUrlPrivate(const QUrlPrivate ©) |
601 | : ref(1), port(copy.port), |
602 | scheme(copy.scheme), |
603 | userName(copy.userName), |
604 | password(copy.password), |
605 | host(copy.host), |
606 | path(copy.path), |
607 | query(copy.query), |
608 | fragment(copy.fragment), |
609 | error(copy.cloneError()), |
610 | sectionIsPresent(copy.sectionIsPresent), |
611 | flags(copy.flags) |
612 | { |
613 | } |
614 | |
615 | inline QUrlPrivate::~QUrlPrivate() |
616 | = default; |
617 | |
618 | std::unique_ptr<QUrlPrivate::Error> QUrlPrivate::cloneError() const |
619 | { |
620 | return error ? qt_make_unique<Error>(*error) : nullptr; |
621 | } |
622 | |
623 | inline void QUrlPrivate::clearError() |
624 | { |
625 | error.reset(); |
626 | } |
627 | |
628 | inline void QUrlPrivate::setError(ErrorCode errorCode, const QString &source, int supplement) |
629 | { |
630 | if (error) { |
631 | // don't overwrite an error set in a previous section during parsing |
632 | return; |
633 | } |
634 | error = qt_make_unique<Error>(); |
635 | error->code = errorCode; |
636 | error->source = source; |
637 | error->position = supplement; |
638 | } |
639 | |
640 | // From RFC 3986, Appendix A Collected ABNF for URI |
641 | // URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ] |
642 | //[...] |
643 | // scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) |
644 | // |
645 | // authority = [ userinfo "@" ] host [ ":" port ] |
646 | // userinfo = *( unreserved / pct-encoded / sub-delims / ":" ) |
647 | // host = IP-literal / IPv4address / reg-name |
648 | // port = *DIGIT |
649 | //[...] |
650 | // reg-name = *( unreserved / pct-encoded / sub-delims ) |
651 | //[..] |
652 | // pchar = unreserved / pct-encoded / sub-delims / ":" / "@" |
653 | // |
654 | // query = *( pchar / "/" / "?" ) |
655 | // |
656 | // fragment = *( pchar / "/" / "?" ) |
657 | // |
658 | // pct-encoded = "%" HEXDIG HEXDIG |
659 | // |
660 | // unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" |
661 | // reserved = gen-delims / sub-delims |
662 | // gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@" |
663 | // sub-delims = "!" / "$" / "&" / "'" / "(" / ")" |
664 | // / "*" / "+" / "," / ";" / "=" |
665 | // the path component has a complex ABNF that basically boils down to |
666 | // slash-separated segments of "pchar" |
667 | |
668 | // The above is the strict definition of the URL components and we mostly |
669 | // adhere to it, with few exceptions. QUrl obeys the following behavior: |
670 | // - percent-encoding sequences always use uppercase HEXDIG; |
671 | // - unreserved characters are *always* decoded, no exceptions; |
672 | // - the space character and bytes with the high bit set are controlled by |
673 | // the EncodeSpaces and EncodeUnicode bits; |
674 | // - control characters, the percent sign itself, and bytes with the high |
675 | // bit set that don't form valid UTF-8 sequences are always encoded, |
676 | // except in FullyDecoded mode; |
677 | // - sub-delims are always left alone, except in FullyDecoded mode; |
678 | // - gen-delim change behavior depending on which section of the URL (or |
679 | // the entire URL) we're looking at; see below; |
680 | // - characters not mentioned above, like "<", and ">", are usually |
681 | // decoded in individual sections of the URL, but encoded when the full |
682 | // URL is put together (we can change on subjective definition of |
683 | // "pretty"). |
684 | // |
685 | // The behavior for the delimiters bears some explanation. The spec says in |
686 | // section 2.2: |
687 | // URIs that differ in the replacement of a reserved character with its |
688 | // corresponding percent-encoded octet are not equivalent. |
689 | // (note: QUrl API mistakenly uses the "reserved" term, so we will refer to |
690 | // them here as "delimiters"). |
691 | // |
692 | // For that reason, we cannot encode delimiters found in decoded form and we |
693 | // cannot decode the ones found in encoded form if that would change the |
694 | // interpretation. Conversely, we *can* perform the transformation if it would |
695 | // not change the interpretation. From the last component of a URL to the first, |
696 | // here are the gen-delims we can unambiguously transform when the field is |
697 | // taken in isolation: |
698 | // - fragment: none, since it's the last |
699 | // - query: "#" is unambiguous |
700 | // - path: "#" and "?" are unambiguous |
701 | // - host: completely special but never ambiguous, see setHost() below. |
702 | // - password: the "#", "?", "/", "[", "]" and "@" characters are unambiguous |
703 | // - username: the "#", "?", "/", "[", "]", "@", and ":" characters are unambiguous |
704 | // - scheme: doesn't accept any delimiter, see setScheme() below. |
705 | // |
706 | // Internally, QUrl stores each component in the format that corresponds to the |
707 | // default mode (PrettyDecoded). It deviates from the "strict" FullyEncoded |
708 | // mode in the following way: |
709 | // - spaces are decoded |
710 | // - valid UTF-8 sequences are decoded |
711 | // - gen-delims that can be unambiguously transformed are decoded |
712 | // - characters controlled by DecodeReserved are often decoded, though this behavior |
713 | // can change depending on the subjective definition of "pretty" |
714 | // |
715 | // Note that the list of gen-delims that we can transform is different for the |
716 | // user info (user name + password) and the authority (user info + host + |
717 | // port). |
718 | |
719 | |
720 | // list the recoding table modifications to be used with the recodeFromUser and |
721 | // appendToUser functions, according to the rules above. Spaces and UTF-8 |
722 | // sequences are handled outside the tables. |
723 | |
724 | // the encodedXXX tables are run with the delimiters set to "leave" by default; |
725 | // the decodedXXX tables are run with the delimiters set to "decode" by default |
726 | // (except for the query, which doesn't use these functions) |
727 | |
728 | namespace { |
729 | template <typename T> constexpr ushort decode(T x) noexcept { return ushort(x); } |
730 | template <typename T> constexpr ushort leave(T x) noexcept { return ushort(0x100 | x); } |
731 | template <typename T> constexpr ushort encode(T x) noexcept { return ushort(0x200 | x); } |
732 | } |
733 | |
734 | static const ushort userNameInIsolation[] = { |
735 | decode(':'), // 0 |
736 | decode('@'), // 1 |
737 | decode(']'), // 2 |
738 | decode('['), // 3 |
739 | decode('/'), // 4 |
740 | decode('?'), // 5 |
741 | decode('#'), // 6 |
742 | |
743 | decode('"'), // 7 |
744 | decode('<'), |
745 | decode('>'), |
746 | decode('^'), |
747 | decode('\\'), |
748 | decode('|'), |
749 | decode('{'), |
750 | decode('}'), |
751 | 0 |
752 | }; |
753 | static const ushort * const passwordInIsolation = userNameInIsolation + 1; |
754 | static const ushort * const pathInIsolation = userNameInIsolation + 5; |
755 | static const ushort * const queryInIsolation = userNameInIsolation + 6; |
756 | static const ushort * const fragmentInIsolation = userNameInIsolation + 7; |
757 | |
758 | static const ushort userNameInUserInfo[] = { |
759 | encode(':'), // 0 |
760 | decode('@'), // 1 |
761 | decode(']'), // 2 |
762 | decode('['), // 3 |
763 | decode('/'), // 4 |
764 | decode('?'), // 5 |
765 | decode('#'), // 6 |
766 | |
767 | decode('"'), // 7 |
768 | decode('<'), |
769 | decode('>'), |
770 | decode('^'), |
771 | decode('\\'), |
772 | decode('|'), |
773 | decode('{'), |
774 | decode('}'), |
775 | 0 |
776 | }; |
777 | static const ushort * const passwordInUserInfo = userNameInUserInfo + 1; |
778 | |
779 | static const ushort userNameInAuthority[] = { |
780 | encode(':'), // 0 |
781 | encode('@'), // 1 |
782 | encode(']'), // 2 |
783 | encode('['), // 3 |
784 | decode('/'), // 4 |
785 | decode('?'), // 5 |
786 | decode('#'), // 6 |
787 | |
788 | decode('"'), // 7 |
789 | decode('<'), |
790 | decode('>'), |
791 | decode('^'), |
792 | decode('\\'), |
793 | decode('|'), |
794 | decode('{'), |
795 | decode('}'), |
796 | 0 |
797 | }; |
798 | static const ushort * const passwordInAuthority = userNameInAuthority + 1; |
799 | |
800 | static const ushort userNameInUrl[] = { |
801 | encode(':'), // 0 |
802 | encode('@'), // 1 |
803 | encode(']'), // 2 |
804 | encode('['), // 3 |
805 | encode('/'), // 4 |
806 | encode('?'), // 5 |
807 | encode('#'), // 6 |
808 | |
809 | // no need to list encode(x) for the other characters |
810 | 0 |
811 | }; |
812 | static const ushort * const passwordInUrl = userNameInUrl + 1; |
813 | static const ushort * const pathInUrl = userNameInUrl + 5; |
814 | static const ushort * const queryInUrl = userNameInUrl + 6; |
815 | static const ushort * const fragmentInUrl = userNameInUrl + 6; |
816 | |
817 | static inline void parseDecodedComponent(QString &data) |
818 | { |
819 | data.replace(QLatin1Char('%'), QLatin1String("%25" )); |
820 | } |
821 | |
822 | static inline QString |
823 | recodeFromUser(const QString &input, const ushort *actions, int from, int to) |
824 | { |
825 | QString output; |
826 | const QChar *begin = input.constData() + from; |
827 | const QChar *end = input.constData() + to; |
828 | if (qt_urlRecode(output, QStringView{begin, end}, {}, actions)) |
829 | return output; |
830 | |
831 | return input.mid(from, to - from); |
832 | } |
833 | |
834 | // appendXXXX functions: copy from the internal form to the external, user form. |
835 | // the internal value is stored in its PrettyDecoded form, so that case is easy. |
836 | static inline void appendToUser(QString &appendTo, QStringView value, QUrl::FormattingOptions options, |
837 | const ushort *actions) |
838 | { |
839 | // Test ComponentFormattingOptions, ignore FormattingOptions. |
840 | if ((options & 0xFFFF0000) == QUrl::PrettyDecoded) { |
841 | appendTo += value; |
842 | return; |
843 | } |
844 | |
845 | if (!qt_urlRecode(appendTo, value, options, actions)) |
846 | appendTo += value; |
847 | } |
848 | |
849 | inline void QUrlPrivate::appendAuthority(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const |
850 | { |
851 | if ((options & QUrl::RemoveUserInfo) != QUrl::RemoveUserInfo) { |
852 | appendUserInfo(appendTo, options, appendingTo); |
853 | |
854 | // add '@' only if we added anything |
855 | if (hasUserName() || (hasPassword() && (options & QUrl::RemovePassword) == 0)) |
856 | appendTo += QLatin1Char('@'); |
857 | } |
858 | appendHost(appendTo, options); |
859 | if (!(options & QUrl::RemovePort) && port != -1) |
860 | appendTo += QLatin1Char(':') + QString::number(port); |
861 | } |
862 | |
863 | inline void QUrlPrivate::appendUserInfo(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const |
864 | { |
865 | if (Q_LIKELY(!hasUserInfo())) |
866 | return; |
867 | |
868 | const ushort *userNameActions; |
869 | const ushort *passwordActions; |
870 | if (options & QUrl::EncodeDelimiters) { |
871 | userNameActions = userNameInUrl; |
872 | passwordActions = passwordInUrl; |
873 | } else { |
874 | switch (appendingTo) { |
875 | case UserInfo: |
876 | userNameActions = userNameInUserInfo; |
877 | passwordActions = passwordInUserInfo; |
878 | break; |
879 | |
880 | case Authority: |
881 | userNameActions = userNameInAuthority; |
882 | passwordActions = passwordInAuthority; |
883 | break; |
884 | |
885 | case FullUrl: |
886 | userNameActions = userNameInUrl; |
887 | passwordActions = passwordInUrl; |
888 | break; |
889 | |
890 | default: |
891 | // can't happen |
892 | Q_UNREACHABLE(); |
893 | break; |
894 | } |
895 | } |
896 | |
897 | if (!qt_urlRecode(appendTo, userName, options, userNameActions)) |
898 | appendTo += userName; |
899 | if (options & QUrl::RemovePassword || !hasPassword()) { |
900 | return; |
901 | } else { |
902 | appendTo += QLatin1Char(':'); |
903 | if (!qt_urlRecode(appendTo, password, options, passwordActions)) |
904 | appendTo += password; |
905 | } |
906 | } |
907 | |
908 | inline void QUrlPrivate::appendUserName(QString &appendTo, QUrl::FormattingOptions options) const |
909 | { |
910 | // only called from QUrl::userName() |
911 | appendToUser(appendTo, userName, options, |
912 | options & QUrl::EncodeDelimiters ? userNameInUrl : userNameInIsolation); |
913 | } |
914 | |
915 | inline void QUrlPrivate::appendPassword(QString &appendTo, QUrl::FormattingOptions options) const |
916 | { |
917 | // only called from QUrl::password() |
918 | appendToUser(appendTo, password, options, |
919 | options & QUrl::EncodeDelimiters ? passwordInUrl : passwordInIsolation); |
920 | } |
921 | |
922 | inline void QUrlPrivate::appendPath(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const |
923 | { |
924 | QString thePath = path; |
925 | if (options & QUrl::NormalizePathSegments) { |
926 | thePath = qt_normalizePathSegments(path, isLocalFile() ? QDirPrivate::DefaultNormalization : QDirPrivate::RemotePath); |
927 | } |
928 | |
929 | QStringView thePathView(thePath); |
930 | if (options & QUrl::RemoveFilename) { |
931 | const int slash = path.lastIndexOf(QLatin1Char('/')); |
932 | if (slash == -1) |
933 | return; |
934 | thePathView = QStringView{path}.left(slash + 1); |
935 | } |
936 | // check if we need to remove trailing slashes |
937 | if (options & QUrl::StripTrailingSlash) { |
938 | while (thePathView.length() > 1 && thePathView.endsWith(QLatin1Char('/'))) |
939 | thePathView.chop(1); |
940 | } |
941 | |
942 | appendToUser(appendTo, thePathView, options, |
943 | appendingTo == FullUrl || options & QUrl::EncodeDelimiters ? pathInUrl : pathInIsolation); |
944 | } |
945 | |
946 | inline void QUrlPrivate::appendFragment(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const |
947 | { |
948 | appendToUser(appendTo, fragment, options, |
949 | options & QUrl::EncodeDelimiters ? fragmentInUrl : |
950 | appendingTo == FullUrl ? nullptr : fragmentInIsolation); |
951 | } |
952 | |
953 | inline void QUrlPrivate::appendQuery(QString &appendTo, QUrl::FormattingOptions options, Section appendingTo) const |
954 | { |
955 | appendToUser(appendTo, query, options, |
956 | appendingTo == FullUrl || options & QUrl::EncodeDelimiters ? queryInUrl : queryInIsolation); |
957 | } |
958 | |
959 | // setXXX functions |
960 | |
961 | inline bool QUrlPrivate::setScheme(const QString &value, int len, bool doSetError) |
962 | { |
963 | // schemes are strictly RFC-compliant: |
964 | // scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) |
965 | // we also lowercase the scheme |
966 | |
967 | // schemes in URLs are not allowed to be empty, but they can be in |
968 | // "Relative URIs" which QUrl also supports. QUrl::setScheme does |
969 | // not call us with len == 0, so this can only be from parse() |
970 | scheme.clear(); |
971 | if (len == 0) |
972 | return false; |
973 | |
974 | sectionIsPresent |= Scheme; |
975 | |
976 | // validate it: |
977 | int needsLowercasing = -1; |
978 | const ushort *p = value.utf16(); |
979 | for (int i = 0; i < len; ++i) { |
980 | if (p[i] >= 'a' && p[i] <= 'z') |
981 | continue; |
982 | if (p[i] >= 'A' && p[i] <= 'Z') { |
983 | needsLowercasing = i; |
984 | continue; |
985 | } |
986 | if (i) { |
987 | if (p[i] >= '0' && p[i] <= '9') |
988 | continue; |
989 | if (p[i] == '+' || p[i] == '-' || p[i] == '.') |
990 | continue; |
991 | } |
992 | |
993 | // found something else |
994 | // don't call setError needlessly: |
995 | // if we've been called from parse(), it will try to recover |
996 | if (doSetError) |
997 | setError(InvalidSchemeError, value, i); |
998 | return false; |
999 | } |
1000 | |
1001 | scheme = value.left(len); |
1002 | |
1003 | if (needsLowercasing != -1) { |
1004 | // schemes are ASCII only, so we don't need the full Unicode toLower |
1005 | QChar *schemeData = scheme.data(); // force detaching here |
1006 | for (int i = needsLowercasing; i >= 0; --i) { |
1007 | ushort c = schemeData[i].unicode(); |
1008 | if (c >= 'A' && c <= 'Z') |
1009 | schemeData[i] = QChar(c + 0x20); |
1010 | } |
1011 | } |
1012 | |
1013 | // did we set to the file protocol? |
1014 | if (scheme == fileScheme() |
1015 | #ifdef Q_OS_WIN |
1016 | || scheme == webDavScheme() |
1017 | #endif |
1018 | ) { |
1019 | flags |= IsLocalFile; |
1020 | } else { |
1021 | flags &= ~IsLocalFile; |
1022 | } |
1023 | return true; |
1024 | } |
1025 | |
1026 | inline void QUrlPrivate::setAuthority(const QString &auth, int from, int end, QUrl::ParsingMode mode) |
1027 | { |
1028 | sectionIsPresent &= ~Authority; |
1029 | sectionIsPresent |= Host; |
1030 | port = -1; |
1031 | |
1032 | // we never actually _loop_ |
1033 | while (from != end) { |
1034 | int userInfoIndex = auth.indexOf(QLatin1Char('@'), from); |
1035 | if (uint(userInfoIndex) < uint(end)) { |
1036 | setUserInfo(auth, from, userInfoIndex); |
1037 | if (mode == QUrl::StrictMode && !validateComponent(UserInfo, auth, from, userInfoIndex)) |
1038 | break; |
1039 | from = userInfoIndex + 1; |
1040 | } |
1041 | |
1042 | int colonIndex = auth.lastIndexOf(QLatin1Char(':'), end - 1); |
1043 | if (colonIndex < from) |
1044 | colonIndex = -1; |
1045 | |
1046 | if (uint(colonIndex) < uint(end)) { |
1047 | if (auth.at(from).unicode() == '[') { |
1048 | // check if colonIndex isn't inside the "[...]" part |
1049 | int closingBracket = auth.indexOf(QLatin1Char(']'), from); |
1050 | if (uint(closingBracket) > uint(colonIndex)) |
1051 | colonIndex = -1; |
1052 | } |
1053 | } |
1054 | |
1055 | if (uint(colonIndex) < uint(end) - 1) { |
1056 | // found a colon with digits after it |
1057 | unsigned long x = 0; |
1058 | for (int i = colonIndex + 1; i < end; ++i) { |
1059 | ushort c = auth.at(i).unicode(); |
1060 | if (c >= '0' && c <= '9') { |
1061 | x *= 10; |
1062 | x += c - '0'; |
1063 | } else { |
1064 | x = ulong(-1); // x != ushort(x) |
1065 | break; |
1066 | } |
1067 | } |
1068 | if (x == ushort(x)) { |
1069 | port = ushort(x); |
1070 | } else { |
1071 | setError(InvalidPortError, auth, colonIndex + 1); |
1072 | if (mode == QUrl::StrictMode) |
1073 | break; |
1074 | } |
1075 | } |
1076 | |
1077 | setHost(auth, from, qMin<uint>(end, colonIndex), mode); |
1078 | if (mode == QUrl::StrictMode && !validateComponent(Host, auth, from, qMin<uint>(end, colonIndex))) { |
1079 | // clear host too |
1080 | sectionIsPresent &= ~Authority; |
1081 | break; |
1082 | } |
1083 | |
1084 | // success |
1085 | return; |
1086 | } |
1087 | // clear all sections but host |
1088 | sectionIsPresent &= ~Authority | Host; |
1089 | userName.clear(); |
1090 | password.clear(); |
1091 | host.clear(); |
1092 | port = -1; |
1093 | } |
1094 | |
1095 | inline void QUrlPrivate::setUserInfo(const QString &userInfo, int from, int end) |
1096 | { |
1097 | int delimIndex = userInfo.indexOf(QLatin1Char(':'), from); |
1098 | setUserName(userInfo, from, qMin<uint>(delimIndex, end)); |
1099 | |
1100 | if (uint(delimIndex) >= uint(end)) { |
1101 | password.clear(); |
1102 | sectionIsPresent &= ~Password; |
1103 | } else { |
1104 | setPassword(userInfo, delimIndex + 1, end); |
1105 | } |
1106 | } |
1107 | |
1108 | inline void QUrlPrivate::setUserName(const QString &value, int from, int end) |
1109 | { |
1110 | sectionIsPresent |= UserName; |
1111 | userName = recodeFromUser(value, userNameInIsolation, from, end); |
1112 | } |
1113 | |
1114 | inline void QUrlPrivate::setPassword(const QString &value, int from, int end) |
1115 | { |
1116 | sectionIsPresent |= Password; |
1117 | password = recodeFromUser(value, passwordInIsolation, from, end); |
1118 | } |
1119 | |
1120 | inline void QUrlPrivate::setPath(const QString &value, int from, int end) |
1121 | { |
1122 | // sectionIsPresent |= Path; // not used, save some cycles |
1123 | path = recodeFromUser(value, pathInIsolation, from, end); |
1124 | } |
1125 | |
1126 | inline void QUrlPrivate::setFragment(const QString &value, int from, int end) |
1127 | { |
1128 | sectionIsPresent |= Fragment; |
1129 | fragment = recodeFromUser(value, fragmentInIsolation, from, end); |
1130 | } |
1131 | |
1132 | inline void QUrlPrivate::setQuery(const QString &value, int from, int iend) |
1133 | { |
1134 | sectionIsPresent |= Query; |
1135 | query = recodeFromUser(value, queryInIsolation, from, iend); |
1136 | } |
1137 | |
1138 | // Host handling |
1139 | // The RFC says the host is: |
1140 | // host = IP-literal / IPv4address / reg-name |
1141 | // IP-literal = "[" ( IPv6address / IPvFuture ) "]" |
1142 | // IPvFuture = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" ) |
1143 | // [a strict definition of IPv6Address and IPv4Address] |
1144 | // reg-name = *( unreserved / pct-encoded / sub-delims ) |
1145 | // |
1146 | // We deviate from the standard in all but IPvFuture. For IPvFuture we accept |
1147 | // and store only exactly what the RFC says we should. No percent-encoding is |
1148 | // permitted in this field, so Unicode characters and space aren't either. |
1149 | // |
1150 | // For IPv4 addresses, we accept broken addresses like inet_aton does (that is, |
1151 | // less than three dots). However, we correct the address to the proper form |
1152 | // and store the corrected address. After correction, we comply to the RFC and |
1153 | // it's exclusively composed of unreserved characters. |
1154 | // |
1155 | // For IPv6 addresses, we accept addresses including trailing (embedded) IPv4 |
1156 | // addresses, the so-called v4-compat and v4-mapped addresses. We also store |
1157 | // those addresses like that in the hostname field, which violates the spec. |
1158 | // IPv6 hosts are stored with the square brackets in the QString. It also |
1159 | // requires no transformation in any way. |
1160 | // |
1161 | // As for registered names, it's the other way around: we accept only valid |
1162 | // hostnames as specified by STD 3 and IDNA. That means everything we accept is |
1163 | // valid in the RFC definition above, but there are many valid reg-names |
1164 | // according to the RFC that we do not accept in the name of security. Since we |
1165 | // do accept IDNA, reg-names are subject to ACE encoding and decoding, which is |
1166 | // specified by the DecodeUnicode flag. The hostname is stored in its Unicode form. |
1167 | |
1168 | inline void QUrlPrivate::appendHost(QString &appendTo, QUrl::FormattingOptions options) const |
1169 | { |
1170 | if (host.isEmpty()) |
1171 | return; |
1172 | if (host.at(0).unicode() == '[') { |
1173 | // IPv6 addresses might contain a zone-id which needs to be recoded |
1174 | if (options != 0) |
1175 | if (qt_urlRecode(appendTo, host, options, nullptr)) |
1176 | return; |
1177 | appendTo += host; |
1178 | } else { |
1179 | // this is either an IPv4Address or a reg-name |
1180 | // if it is a reg-name, it is already stored in Unicode form |
1181 | if (options & QUrl::EncodeUnicode && !(options & 0x4000000)) |
1182 | appendTo += qt_ACE_do(host, ToAceOnly, AllowLeadingDot); |
1183 | else |
1184 | appendTo += host; |
1185 | } |
1186 | } |
1187 | |
1188 | // the whole IPvFuture is passed and parsed here, including brackets; |
1189 | // returns null if the parsing was successful, or the QChar of the first failure |
1190 | static const QChar *parseIpFuture(QString &host, const QChar *begin, const QChar *end, QUrl::ParsingMode mode) |
1191 | { |
1192 | // IPvFuture = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" ) |
1193 | static const char acceptable[] = |
1194 | "!$&'()*+,;=" // sub-delims |
1195 | ":" // ":" |
1196 | "-._~" ; // unreserved |
1197 | |
1198 | // the brackets and the "v" have been checked |
1199 | const QChar *const origBegin = begin; |
1200 | if (begin[3].unicode() != '.') |
1201 | return &begin[3]; |
1202 | if ((begin[2].unicode() >= 'A' && begin[2].unicode() <= 'F') || |
1203 | (begin[2].unicode() >= 'a' && begin[2].unicode() <= 'f') || |
1204 | (begin[2].unicode() >= '0' && begin[2].unicode() <= '9')) { |
1205 | // this is so unlikely that we'll just go down the slow path |
1206 | // decode the whole string, skipping the "[vH." and "]" which we already know to be there |
1207 | host += QStringView(begin, 4); |
1208 | |
1209 | // uppercase the version, if necessary |
1210 | if (begin[2].unicode() >= 'a') |
1211 | host[host.length() - 2] = QChar{begin[2].unicode() - 0x20}; |
1212 | |
1213 | begin += 4; |
1214 | --end; |
1215 | |
1216 | QString decoded; |
1217 | if (mode == QUrl::TolerantMode && qt_urlRecode(decoded, QStringView{begin, end}, QUrl::FullyDecoded, nullptr)) { |
1218 | begin = decoded.constBegin(); |
1219 | end = decoded.constEnd(); |
1220 | } |
1221 | |
1222 | for ( ; begin != end; ++begin) { |
1223 | if (begin->unicode() >= 'A' && begin->unicode() <= 'Z') |
1224 | host += *begin; |
1225 | else if (begin->unicode() >= 'a' && begin->unicode() <= 'z') |
1226 | host += *begin; |
1227 | else if (begin->unicode() >= '0' && begin->unicode() <= '9') |
1228 | host += *begin; |
1229 | else if (begin->unicode() < 0x80 && strchr(acceptable, begin->unicode()) != nullptr) |
1230 | host += *begin; |
1231 | else |
1232 | return decoded.isEmpty() ? begin : &origBegin[2]; |
1233 | } |
1234 | host += QLatin1Char(']'); |
1235 | return nullptr; |
1236 | } |
1237 | return &origBegin[2]; |
1238 | } |
1239 | |
1240 | // ONLY the IPv6 address is parsed here, WITHOUT the brackets |
1241 | static const QChar *parseIp6(QString &host, const QChar *begin, const QChar *end, QUrl::ParsingMode mode) |
1242 | { |
1243 | // ### Update to use QStringView once QStringView::indexOf and QStringView::lastIndexOf exists |
1244 | QString decoded; |
1245 | if (mode == QUrl::TolerantMode) { |
1246 | // this struct is kept in automatic storage because it's only 4 bytes |
1247 | const ushort decodeColon[] = { decode(':'), 0 }; |
1248 | if (qt_urlRecode(decoded, QStringView{begin, end}, QUrl::ComponentFormattingOption::PrettyDecoded, decodeColon) == 0) |
1249 | decoded = QString(begin, end-begin); |
1250 | } else { |
1251 | decoded = QString(begin, end-begin); |
1252 | } |
1253 | |
1254 | const QLatin1String zoneIdIdentifier("%25" ); |
1255 | QIPAddressUtils::IPv6Address address; |
1256 | QString zoneId; |
1257 | |
1258 | const QChar *endBeforeZoneId = decoded.constEnd(); |
1259 | |
1260 | int zoneIdPosition = decoded.indexOf(zoneIdIdentifier); |
1261 | if ((zoneIdPosition != -1) && (decoded.lastIndexOf(zoneIdIdentifier) == zoneIdPosition)) { |
1262 | zoneId = decoded.mid(zoneIdPosition + zoneIdIdentifier.size()); |
1263 | endBeforeZoneId = decoded.constBegin() + zoneIdPosition; |
1264 | |
1265 | if (zoneId.isEmpty()) |
1266 | return end; |
1267 | } |
1268 | |
1269 | const QChar *ret = QIPAddressUtils::parseIp6(address, decoded.constBegin(), endBeforeZoneId); |
1270 | if (ret) |
1271 | return begin + (ret - decoded.constBegin()); |
1272 | |
1273 | host.reserve(host.size() + (decoded.constEnd() - decoded.constBegin())); |
1274 | host += QLatin1Char('['); |
1275 | QIPAddressUtils::toString(host, address); |
1276 | |
1277 | if (!zoneId.isEmpty()) { |
1278 | host += zoneIdIdentifier; |
1279 | host += zoneId; |
1280 | } |
1281 | host += QLatin1Char(']'); |
1282 | return nullptr; |
1283 | } |
1284 | |
1285 | inline bool QUrlPrivate::setHost(const QString &value, int from, int iend, QUrl::ParsingMode mode) |
1286 | { |
1287 | const QChar *begin = value.constData() + from; |
1288 | const QChar *end = value.constData() + iend; |
1289 | |
1290 | const int len = end - begin; |
1291 | host.clear(); |
1292 | sectionIsPresent |= Host; |
1293 | if (len == 0) |
1294 | return true; |
1295 | |
1296 | if (begin[0].unicode() == '[') { |
1297 | // IPv6Address or IPvFuture |
1298 | // smallest IPv6 address is "[::]" (len = 4) |
1299 | // smallest IPvFuture address is "[v7.X]" (len = 6) |
1300 | if (end[-1].unicode() != ']') { |
1301 | setError(HostMissingEndBracket, value); |
1302 | return false; |
1303 | } |
1304 | |
1305 | if (len > 5 && begin[1].unicode() == 'v') { |
1306 | const QChar *c = parseIpFuture(host, begin, end, mode); |
1307 | if (c) |
1308 | setError(InvalidIPvFutureError, value, c - value.constData()); |
1309 | return !c; |
1310 | } else if (begin[1].unicode() == 'v') { |
1311 | setError(InvalidIPvFutureError, value, from); |
1312 | } |
1313 | |
1314 | const QChar *c = parseIp6(host, begin + 1, end - 1, mode); |
1315 | if (!c) |
1316 | return true; |
1317 | |
1318 | if (c == end - 1) |
1319 | setError(InvalidIPv6AddressError, value, from); |
1320 | else |
1321 | setError(InvalidCharacterInIPv6Error, value, c - value.constData()); |
1322 | return false; |
1323 | } |
1324 | |
1325 | // check if it's an IPv4 address |
1326 | QIPAddressUtils::IPv4Address ip4; |
1327 | if (QIPAddressUtils::parseIp4(ip4, begin, end)) { |
1328 | // yes, it was |
1329 | QIPAddressUtils::toString(host, ip4); |
1330 | return true; |
1331 | } |
1332 | |
1333 | // This is probably a reg-name. |
1334 | // But it can also be an encoded string that, when decoded becomes one |
1335 | // of the types above. |
1336 | // |
1337 | // Two types of encoding are possible: |
1338 | // percent encoding (e.g., "%31%30%2E%30%2E%30%2E%31" -> "10.0.0.1") |
1339 | // Unicode encoding (some non-ASCII characters case-fold to digits |
1340 | // when nameprepping is done) |
1341 | // |
1342 | // The qt_ACE_do function below applies nameprepping and the STD3 check. |
1343 | // That means a Unicode string may become an IPv4 address, but it cannot |
1344 | // produce a '[' or a '%'. |
1345 | |
1346 | // check for percent-encoding first |
1347 | QString s; |
1348 | if (mode == QUrl::TolerantMode && qt_urlRecode(s, QStringView{begin, end}, { }, nullptr)) { |
1349 | // something was decoded |
1350 | // anything encoded left? |
1351 | int pos = s.indexOf(QChar(0x25)); // '%' |
1352 | if (pos != -1) { |
1353 | setError(InvalidRegNameError, s, pos); |
1354 | return false; |
1355 | } |
1356 | |
1357 | // recurse |
1358 | return setHost(s, 0, s.length(), QUrl::StrictMode); |
1359 | } |
1360 | |
1361 | s = qt_ACE_do(QStringView(begin, len), NormalizeAce, ForbidLeadingDot); |
1362 | if (s.isEmpty()) { |
1363 | setError(InvalidRegNameError, value); |
1364 | return false; |
1365 | } |
1366 | |
1367 | // check IPv4 again |
1368 | if (QIPAddressUtils::parseIp4(ip4, s.constBegin(), s.constEnd())) { |
1369 | QIPAddressUtils::toString(host, ip4); |
1370 | } else { |
1371 | host = s; |
1372 | } |
1373 | return true; |
1374 | } |
1375 | |
1376 | inline void QUrlPrivate::parse(const QString &url, QUrl::ParsingMode parsingMode) |
1377 | { |
1378 | // URI-reference = URI / relative-ref |
1379 | // URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ] |
1380 | // relative-ref = relative-part [ "?" query ] [ "#" fragment ] |
1381 | // hier-part = "//" authority path-abempty |
1382 | // / other path types |
1383 | // relative-part = "//" authority path-abempty |
1384 | // / other path types here |
1385 | |
1386 | sectionIsPresent = 0; |
1387 | flags = 0; |
1388 | clearError(); |
1389 | |
1390 | // find the important delimiters |
1391 | int colon = -1; |
1392 | int question = -1; |
1393 | int hash = -1; |
1394 | const int len = url.length(); |
1395 | const QChar *const begin = url.constData(); |
1396 | const ushort *const data = reinterpret_cast<const ushort *>(begin); |
1397 | |
1398 | for (int i = 0; i < len; ++i) { |
1399 | uint uc = data[i]; |
1400 | if (uc == '#' && hash == -1) { |
1401 | hash = i; |
1402 | |
1403 | // nothing more to be found |
1404 | break; |
1405 | } |
1406 | |
1407 | if (question == -1) { |
1408 | if (uc == ':' && colon == -1) |
1409 | colon = i; |
1410 | else if (uc == '?') |
1411 | question = i; |
1412 | } |
1413 | } |
1414 | |
1415 | // check if we have a scheme |
1416 | int hierStart; |
1417 | if (colon != -1 && setScheme(url, colon, /* don't set error */ false)) { |
1418 | hierStart = colon + 1; |
1419 | } else { |
1420 | // recover from a failed scheme: it might not have been a scheme at all |
1421 | scheme.clear(); |
1422 | sectionIsPresent = 0; |
1423 | hierStart = 0; |
1424 | } |
1425 | |
1426 | int pathStart; |
1427 | int hierEnd = qMin<uint>(qMin<uint>(question, hash), len); |
1428 | if (hierEnd - hierStart >= 2 && data[hierStart] == '/' && data[hierStart + 1] == '/') { |
1429 | // we have an authority, it ends at the first slash after these |
1430 | int authorityEnd = hierEnd; |
1431 | for (int i = hierStart + 2; i < authorityEnd ; ++i) { |
1432 | if (data[i] == '/') { |
1433 | authorityEnd = i; |
1434 | break; |
1435 | } |
1436 | } |
1437 | |
1438 | setAuthority(url, hierStart + 2, authorityEnd, parsingMode); |
1439 | |
1440 | // even if we failed to set the authority properly, let's try to recover |
1441 | pathStart = authorityEnd; |
1442 | setPath(url, pathStart, hierEnd); |
1443 | } else { |
1444 | userName.clear(); |
1445 | password.clear(); |
1446 | host.clear(); |
1447 | port = -1; |
1448 | pathStart = hierStart; |
1449 | |
1450 | if (hierStart < hierEnd) |
1451 | setPath(url, hierStart, hierEnd); |
1452 | else |
1453 | path.clear(); |
1454 | } |
1455 | |
1456 | if (uint(question) < uint(hash)) |
1457 | setQuery(url, question + 1, qMin<uint>(hash, len)); |
1458 | |
1459 | if (hash != -1) |
1460 | setFragment(url, hash + 1, len); |
1461 | |
1462 | if (error || parsingMode == QUrl::TolerantMode) |
1463 | return; |
1464 | |
1465 | // The parsing so far was partially tolerant of errors, except for the |
1466 | // scheme parser (which is always strict) and the authority (which was |
1467 | // executed in strict mode). |
1468 | // If we haven't found any errors so far, continue the strict-mode parsing |
1469 | // from the path component onwards. |
1470 | |
1471 | if (!validateComponent(Path, url, pathStart, hierEnd)) |
1472 | return; |
1473 | if (uint(question) < uint(hash) && !validateComponent(Query, url, question + 1, qMin<uint>(hash, len))) |
1474 | return; |
1475 | if (hash != -1) |
1476 | validateComponent(Fragment, url, hash + 1, len); |
1477 | } |
1478 | |
1479 | QString QUrlPrivate::toLocalFile(QUrl::FormattingOptions options) const |
1480 | { |
1481 | QString tmp; |
1482 | QString ourPath; |
1483 | appendPath(ourPath, options, QUrlPrivate::Path); |
1484 | |
1485 | // magic for shared drive on windows |
1486 | if (!host.isEmpty()) { |
1487 | tmp = QLatin1String("//" ) + host; |
1488 | #ifdef Q_OS_WIN // QTBUG-42346, WebDAV is visible as local file on Windows only. |
1489 | if (scheme == webDavScheme()) |
1490 | tmp += webDavSslTag(); |
1491 | #endif |
1492 | if (!ourPath.isEmpty() && !ourPath.startsWith(QLatin1Char('/'))) |
1493 | tmp += QLatin1Char('/'); |
1494 | tmp += ourPath; |
1495 | } else { |
1496 | tmp = ourPath; |
1497 | #ifdef Q_OS_WIN |
1498 | // magic for drives on windows |
1499 | if (ourPath.length() > 2 && ourPath.at(0) == QLatin1Char('/') && ourPath.at(2) == QLatin1Char(':')) |
1500 | tmp.remove(0, 1); |
1501 | #endif |
1502 | } |
1503 | return tmp; |
1504 | } |
1505 | |
1506 | /* |
1507 | From http://www.ietf.org/rfc/rfc3986.txt, 5.2.3: Merge paths |
1508 | |
1509 | Returns a merge of the current path with the relative path passed |
1510 | as argument. |
1511 | |
1512 | Note: \a relativePath is relative (does not start with '/'). |
1513 | */ |
1514 | inline QString QUrlPrivate::mergePaths(const QString &relativePath) const |
1515 | { |
1516 | // If the base URI has a defined authority component and an empty |
1517 | // path, then return a string consisting of "/" concatenated with |
1518 | // the reference's path; otherwise, |
1519 | if (!host.isEmpty() && path.isEmpty()) |
1520 | return QLatin1Char('/') + relativePath; |
1521 | |
1522 | // Return a string consisting of the reference's path component |
1523 | // appended to all but the last segment of the base URI's path |
1524 | // (i.e., excluding any characters after the right-most "/" in the |
1525 | // base URI path, or excluding the entire base URI path if it does |
1526 | // not contain any "/" characters). |
1527 | QString newPath; |
1528 | if (!path.contains(QLatin1Char('/'))) |
1529 | newPath = relativePath; |
1530 | else |
1531 | newPath = QStringView{path}.left(path.lastIndexOf(QLatin1Char('/')) + 1) + relativePath; |
1532 | |
1533 | return newPath; |
1534 | } |
1535 | |
1536 | /* |
1537 | From http://www.ietf.org/rfc/rfc3986.txt, 5.2.4: Remove dot segments |
1538 | |
1539 | Removes unnecessary ../ and ./ from the path. Used for normalizing |
1540 | the URL. |
1541 | */ |
1542 | static void removeDotsFromPath(QString *path) |
1543 | { |
1544 | // The input buffer is initialized with the now-appended path |
1545 | // components and the output buffer is initialized to the empty |
1546 | // string. |
1547 | QChar *out = path->data(); |
1548 | const QChar *in = out; |
1549 | const QChar *end = out + path->size(); |
1550 | |
1551 | // If the input buffer consists only of |
1552 | // "." or "..", then remove that from the input |
1553 | // buffer; |
1554 | if (path->size() == 1 && in[0].unicode() == '.') |
1555 | ++in; |
1556 | else if (path->size() == 2 && in[0].unicode() == '.' && in[1].unicode() == '.') |
1557 | in += 2; |
1558 | // While the input buffer is not empty, loop: |
1559 | while (in < end) { |
1560 | |
1561 | // otherwise, if the input buffer begins with a prefix of "../" or "./", |
1562 | // then remove that prefix from the input buffer; |
1563 | if (path->size() >= 2 && in[0].unicode() == '.' && in[1].unicode() == '/') |
1564 | in += 2; |
1565 | else if (path->size() >= 3 && in[0].unicode() == '.' |
1566 | && in[1].unicode() == '.' && in[2].unicode() == '/') |
1567 | in += 3; |
1568 | |
1569 | // otherwise, if the input buffer begins with a prefix of |
1570 | // "/./" or "/.", where "." is a complete path segment, |
1571 | // then replace that prefix with "/" in the input buffer; |
1572 | if (in <= end - 3 && in[0].unicode() == '/' && in[1].unicode() == '.' |
1573 | && in[2].unicode() == '/') { |
1574 | in += 2; |
1575 | continue; |
1576 | } else if (in == end - 2 && in[0].unicode() == '/' && in[1].unicode() == '.') { |
1577 | *out++ = QLatin1Char('/'); |
1578 | in += 2; |
1579 | break; |
1580 | } |
1581 | |
1582 | // otherwise, if the input buffer begins with a prefix |
1583 | // of "/../" or "/..", where ".." is a complete path |
1584 | // segment, then replace that prefix with "/" in the |
1585 | // input buffer and remove the last //segment and its |
1586 | // preceding "/" (if any) from the output buffer; |
1587 | if (in <= end - 4 && in[0].unicode() == '/' && in[1].unicode() == '.' |
1588 | && in[2].unicode() == '.' && in[3].unicode() == '/') { |
1589 | while (out > path->constData() && (--out)->unicode() != '/') |
1590 | ; |
1591 | if (out == path->constData() && out->unicode() != '/') |
1592 | ++in; |
1593 | in += 3; |
1594 | continue; |
1595 | } else if (in == end - 3 && in[0].unicode() == '/' && in[1].unicode() == '.' |
1596 | && in[2].unicode() == '.') { |
1597 | while (out > path->constData() && (--out)->unicode() != '/') |
1598 | ; |
1599 | if (out->unicode() == '/') |
1600 | ++out; |
1601 | in += 3; |
1602 | break; |
1603 | } |
1604 | |
1605 | // otherwise move the first path segment in |
1606 | // the input buffer to the end of the output |
1607 | // buffer, including the initial "/" character |
1608 | // (if any) and any subsequent characters up |
1609 | // to, but not including, the next "/" |
1610 | // character or the end of the input buffer. |
1611 | *out++ = *in++; |
1612 | while (in < end && in->unicode() != '/') |
1613 | *out++ = *in++; |
1614 | } |
1615 | path->truncate(out - path->constData()); |
1616 | } |
1617 | |
1618 | inline QUrlPrivate::ErrorCode QUrlPrivate::validityError(QString *source, int *position) const |
1619 | { |
1620 | Q_ASSERT(!source == !position); |
1621 | if (error) { |
1622 | if (source) { |
1623 | *source = error->source; |
1624 | *position = error->position; |
1625 | } |
1626 | return error->code; |
1627 | } |
1628 | |
1629 | // There are three more cases of invalid URLs that QUrl recognizes and they |
1630 | // are only possible with constructed URLs (setXXX methods), not with |
1631 | // parsing. Therefore, they are tested here. |
1632 | // |
1633 | // Two cases are a non-empty path that doesn't start with a slash and: |
1634 | // - with an authority |
1635 | // - without an authority, without scheme but the path with a colon before |
1636 | // the first slash |
1637 | // The third case is an empty authority and a non-empty path that starts |
1638 | // with "//". |
1639 | // Those cases are considered invalid because toString() would produce a URL |
1640 | // that wouldn't be parsed back to the same QUrl. |
1641 | |
1642 | if (path.isEmpty()) |
1643 | return NoError; |
1644 | if (path.at(0) == QLatin1Char('/')) { |
1645 | if (hasAuthority() || path.length() == 1 || path.at(1) != QLatin1Char('/')) |
1646 | return NoError; |
1647 | if (source) { |
1648 | *source = path; |
1649 | *position = 0; |
1650 | } |
1651 | return AuthorityAbsentAndPathIsDoubleSlash; |
1652 | } |
1653 | |
1654 | if (sectionIsPresent & QUrlPrivate::Host) { |
1655 | if (source) { |
1656 | *source = path; |
1657 | *position = 0; |
1658 | } |
1659 | return AuthorityPresentAndPathIsRelative; |
1660 | } |
1661 | if (sectionIsPresent & QUrlPrivate::Scheme) |
1662 | return NoError; |
1663 | |
1664 | // check for a path of "text:text/" |
1665 | for (int i = 0; i < path.length(); ++i) { |
1666 | ushort c = path.at(i).unicode(); |
1667 | if (c == '/') { |
1668 | // found the slash before the colon |
1669 | return NoError; |
1670 | } |
1671 | if (c == ':') { |
1672 | // found the colon before the slash, it's invalid |
1673 | if (source) { |
1674 | *source = path; |
1675 | *position = i; |
1676 | } |
1677 | return RelativeUrlPathContainsColonBeforeSlash; |
1678 | } |
1679 | } |
1680 | return NoError; |
1681 | } |
1682 | |
1683 | bool QUrlPrivate::validateComponent(QUrlPrivate::Section section, const QString &input, |
1684 | int begin, int end) |
1685 | { |
1686 | // What we need to look out for, that the regular parser tolerates: |
1687 | // - percent signs not followed by two hex digits |
1688 | // - forbidden characters, which should always appear encoded |
1689 | // '"' / '<' / '>' / '\' / '^' / '`' / '{' / '|' / '}' / BKSP |
1690 | // control characters |
1691 | // - delimiters not allowed in certain positions |
1692 | // . scheme: parser is already strict |
1693 | // . user info: gen-delims except ":" disallowed ("/" / "?" / "#" / "[" / "]" / "@") |
1694 | // . host: parser is stricter than the standard |
1695 | // . port: parser is stricter than the standard |
1696 | // . path: all delimiters allowed |
1697 | // . fragment: all delimiters allowed |
1698 | // . query: all delimiters allowed |
1699 | static const char forbidden[] = "\"<>\\^`{|}\x7F" ; |
1700 | static const char forbiddenUserInfo[] = ":/?#[]@" ; |
1701 | |
1702 | Q_ASSERT(section != Authority && section != Hierarchy && section != FullUrl); |
1703 | |
1704 | const ushort *const data = reinterpret_cast<const ushort *>(input.constData()); |
1705 | for (uint i = uint(begin); i < uint(end); ++i) { |
1706 | uint uc = data[i]; |
1707 | if (uc >= 0x80) |
1708 | continue; |
1709 | |
1710 | bool error = false; |
1711 | if ((uc == '%' && (uint(end) < i + 2 || !isHex(data[i + 1]) || !isHex(data[i + 2]))) |
1712 | || uc <= 0x20 || strchr(forbidden, uc)) { |
1713 | // found an error |
1714 | error = true; |
1715 | } else if (section & UserInfo) { |
1716 | if (section == UserInfo && strchr(forbiddenUserInfo + 1, uc)) |
1717 | error = true; |
1718 | else if (section != UserInfo && strchr(forbiddenUserInfo, uc)) |
1719 | error = true; |
1720 | } |
1721 | |
1722 | if (!error) |
1723 | continue; |
1724 | |
1725 | ErrorCode errorCode = ErrorCode(int(section) << 8); |
1726 | if (section == UserInfo) { |
1727 | // is it the user name or the password? |
1728 | errorCode = InvalidUserNameError; |
1729 | for (uint j = uint(begin); j < i; ++j) |
1730 | if (data[j] == ':') { |
1731 | errorCode = InvalidPasswordError; |
1732 | break; |
1733 | } |
1734 | } |
1735 | |
1736 | setError(errorCode, input, i); |
1737 | return false; |
1738 | } |
1739 | |
1740 | // no errors |
1741 | return true; |
1742 | } |
1743 | |
1744 | #if 0 |
1745 | inline void QUrlPrivate::validate() const |
1746 | { |
1747 | QUrlPrivate *that = (QUrlPrivate *)this; |
1748 | that->encodedOriginal = that->toEncoded(); // may detach |
1749 | parse(ParseOnly); |
1750 | |
1751 | QURL_SETFLAG(that->stateFlags, Validated); |
1752 | |
1753 | if (!isValid) |
1754 | return; |
1755 | |
1756 | QString auth = authority(); // causes the non-encoded forms to be valid |
1757 | |
1758 | // authority() calls canonicalHost() which sets this |
1759 | if (!isHostValid) |
1760 | return; |
1761 | |
1762 | if (scheme == QLatin1String("mailto" )) { |
1763 | if (!host.isEmpty() || port != -1 || !userName.isEmpty() || !password.isEmpty()) { |
1764 | that->isValid = false; |
1765 | that->errorInfo.setParams(0, QT_TRANSLATE_NOOP(QUrl, "expected empty host, username," |
1766 | "port and password" ), |
1767 | 0, 0); |
1768 | } |
1769 | } else if (scheme == ftpScheme() || scheme == httpScheme()) { |
1770 | if (host.isEmpty() && !(path.isEmpty() && encodedPath.isEmpty())) { |
1771 | that->isValid = false; |
1772 | that->errorInfo.setParams(0, QT_TRANSLATE_NOOP(QUrl, "the host is empty, but not the path" ), |
1773 | 0, 0); |
1774 | } |
1775 | } |
1776 | } |
1777 | #endif |
1778 | |
1779 | /*! |
1780 | \macro QT_NO_URL_CAST_FROM_STRING |
1781 | \relates QUrl |
1782 | |
1783 | Disables automatic conversions from QString (or char *) to QUrl. |
1784 | |
1785 | Compiling your code with this define is useful when you have a lot of |
1786 | code that uses QString for file names and you wish to convert it to |
1787 | use QUrl for network transparency. In any code that uses QUrl, it can |
1788 | help avoid missing QUrl::resolved() calls, and other misuses of |
1789 | QString to QUrl conversions. |
1790 | |
1791 | \oldcode |
1792 | url = filename; // probably not what you want |
1793 | \newcode |
1794 | url = QUrl::fromLocalFile(filename); |
1795 | url = baseurl.resolved(QUrl(filename)); |
1796 | \endcode |
1797 | |
1798 | \sa QT_NO_CAST_FROM_ASCII |
1799 | */ |
1800 | |
1801 | |
1802 | /*! |
1803 | Constructs a URL by parsing \a url. QUrl will automatically percent encode |
1804 | all characters that are not allowed in a URL and decode the percent-encoded |
1805 | sequences that represent an unreserved character (letters, digits, hyphens, |
1806 | undercores, dots and tildes). All other characters are left in their |
1807 | original forms. |
1808 | |
1809 | Parses the \a url using the parser mode \a parsingMode. In TolerantMode |
1810 | (the default), QUrl will correct certain mistakes, notably the presence of |
1811 | a percent character ('%') not followed by two hexadecimal digits, and it |
1812 | will accept any character in any position. In StrictMode, encoding mistakes |
1813 | will not be tolerated and QUrl will also check that certain forbidden |
1814 | characters are not present in unencoded form. If an error is detected in |
1815 | StrictMode, isValid() will return false. The parsing mode DecodedMode is not |
1816 | permitted in this context. |
1817 | |
1818 | Example: |
1819 | |
1820 | \snippet code/src_corelib_io_qurl.cpp 0 |
1821 | |
1822 | To construct a URL from an encoded string, you can also use fromEncoded(): |
1823 | |
1824 | \snippet code/src_corelib_io_qurl.cpp 1 |
1825 | |
1826 | Both functions are equivalent and, in Qt 5, both functions accept encoded |
1827 | data. Usually, the choice of the QUrl constructor or setUrl() versus |
1828 | fromEncoded() will depend on the source data: the constructor and setUrl() |
1829 | take a QString, whereas fromEncoded takes a QByteArray. |
1830 | |
1831 | \sa setUrl(), fromEncoded(), TolerantMode |
1832 | */ |
1833 | QUrl::QUrl(const QString &url, ParsingMode parsingMode) : d(nullptr) |
1834 | { |
1835 | setUrl(url, parsingMode); |
1836 | } |
1837 | |
1838 | /*! |
1839 | Constructs an empty QUrl object. |
1840 | */ |
1841 | QUrl::QUrl() : d(nullptr) |
1842 | { |
1843 | } |
1844 | |
1845 | /*! |
1846 | Constructs a copy of \a other. |
1847 | */ |
1848 | QUrl::QUrl(const QUrl &other) : d(other.d) |
1849 | { |
1850 | if (d) |
1851 | d->ref.ref(); |
1852 | } |
1853 | |
1854 | /*! |
1855 | Destructor; called immediately before the object is deleted. |
1856 | */ |
1857 | QUrl::~QUrl() |
1858 | { |
1859 | if (d && !d->ref.deref()) |
1860 | delete d; |
1861 | } |
1862 | |
1863 | /*! |
1864 | Returns \c true if the URL is non-empty and valid; otherwise returns \c false. |
1865 | |
1866 | The URL is run through a conformance test. Every part of the URL |
1867 | must conform to the standard encoding rules of the URI standard |
1868 | for the URL to be reported as valid. |
1869 | |
1870 | \snippet code/src_corelib_io_qurl.cpp 2 |
1871 | */ |
1872 | bool QUrl::isValid() const |
1873 | { |
1874 | if (isEmpty()) { |
1875 | // also catches d == nullptr |
1876 | return false; |
1877 | } |
1878 | return d->validityError() == QUrlPrivate::NoError; |
1879 | } |
1880 | |
1881 | /*! |
1882 | Returns \c true if the URL has no data; otherwise returns \c false. |
1883 | |
1884 | \sa clear() |
1885 | */ |
1886 | bool QUrl::isEmpty() const |
1887 | { |
1888 | if (!d) return true; |
1889 | return d->isEmpty(); |
1890 | } |
1891 | |
1892 | /*! |
1893 | Resets the content of the QUrl. After calling this function, the |
1894 | QUrl is equal to one that has been constructed with the default |
1895 | empty constructor. |
1896 | |
1897 | \sa isEmpty() |
1898 | */ |
1899 | void QUrl::clear() |
1900 | { |
1901 | if (d && !d->ref.deref()) |
1902 | delete d; |
1903 | d = nullptr; |
1904 | } |
1905 | |
1906 | /*! |
1907 | Parses \a url and sets this object to that value. QUrl will automatically |
1908 | percent encode all characters that are not allowed in a URL and decode the |
1909 | percent-encoded sequences that represent an unreserved character (letters, |
1910 | digits, hyphens, undercores, dots and tildes). All other characters are |
1911 | left in their original forms. |
1912 | |
1913 | Parses the \a url using the parser mode \a parsingMode. In TolerantMode |
1914 | (the default), QUrl will correct certain mistakes, notably the presence of |
1915 | a percent character ('%') not followed by two hexadecimal digits, and it |
1916 | will accept any character in any position. In StrictMode, encoding mistakes |
1917 | will not be tolerated and QUrl will also check that certain forbidden |
1918 | characters are not present in unencoded form. If an error is detected in |
1919 | StrictMode, isValid() will return false. The parsing mode DecodedMode is |
1920 | not permitted in this context and will produce a run-time warning. |
1921 | |
1922 | \sa url(), toString() |
1923 | */ |
1924 | void QUrl::setUrl(const QString &url, ParsingMode parsingMode) |
1925 | { |
1926 | if (parsingMode == DecodedMode) { |
1927 | qWarning("QUrl: QUrl::DecodedMode is not permitted when parsing a full URL" ); |
1928 | } else { |
1929 | detach(); |
1930 | d->parse(url, parsingMode); |
1931 | } |
1932 | } |
1933 | |
1934 | /*! |
1935 | Sets the scheme of the URL to \a scheme. As a scheme can only |
1936 | contain ASCII characters, no conversion or decoding is done on the |
1937 | input. It must also start with an ASCII letter. |
1938 | |
1939 | The scheme describes the type (or protocol) of the URL. It's |
1940 | represented by one or more ASCII characters at the start the URL. |
1941 | |
1942 | A scheme is strictly \l {http://www.ietf.org/rfc/rfc3986.txt} {RFC 3986}-compliant: |
1943 | \tt {scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )} |
1944 | |
1945 | The following example shows a URL where the scheme is "ftp": |
1946 | |
1947 | \image qurl-authority2.png |
1948 | |
1949 | To set the scheme, the following call is used: |
1950 | \snippet code/src_corelib_io_qurl.cpp 11 |
1951 | |
1952 | The scheme can also be empty, in which case the URL is interpreted |
1953 | as relative. |
1954 | |
1955 | \sa scheme(), isRelative() |
1956 | */ |
1957 | void QUrl::setScheme(const QString &scheme) |
1958 | { |
1959 | detach(); |
1960 | d->clearError(); |
1961 | if (scheme.isEmpty()) { |
1962 | // schemes are not allowed to be empty |
1963 | d->sectionIsPresent &= ~QUrlPrivate::Scheme; |
1964 | d->flags &= ~QUrlPrivate::IsLocalFile; |
1965 | d->scheme.clear(); |
1966 | } else { |
1967 | d->setScheme(scheme, scheme.length(), /* do set error */ true); |
1968 | } |
1969 | } |
1970 | |
1971 | /*! |
1972 | Returns the scheme of the URL. If an empty string is returned, |
1973 | this means the scheme is undefined and the URL is then relative. |
1974 | |
1975 | The scheme can only contain US-ASCII letters or digits, which means it |
1976 | cannot contain any character that would otherwise require encoding. |
1977 | Additionally, schemes are always returned in lowercase form. |
1978 | |
1979 | \sa setScheme(), isRelative() |
1980 | */ |
1981 | QString QUrl::scheme() const |
1982 | { |
1983 | if (!d) return QString(); |
1984 | |
1985 | return d->scheme; |
1986 | } |
1987 | |
1988 | /*! |
1989 | Sets the authority of the URL to \a authority. |
1990 | |
1991 | The authority of a URL is the combination of user info, a host |
1992 | name and a port. All of these elements are optional; an empty |
1993 | authority is therefore valid. |
1994 | |
1995 | The user info and host are separated by a '@', and the host and |
1996 | port are separated by a ':'. If the user info is empty, the '@' |
1997 | must be omitted; although a stray ':' is permitted if the port is |
1998 | empty. |
1999 | |
2000 | The following example shows a valid authority string: |
2001 | |
2002 | \image qurl-authority.png |
2003 | |
2004 | The \a authority data is interpreted according to \a mode: in StrictMode, |
2005 | any '%' characters must be followed by exactly two hexadecimal characters |
2006 | and some characters (including space) are not allowed in undecoded form. In |
2007 | TolerantMode (the default), all characters are accepted in undecoded form |
2008 | and the tolerant parser will correct stray '%' not followed by two hex |
2009 | characters. |
2010 | |
2011 | This function does not allow \a mode to be QUrl::DecodedMode. To set fully |
2012 | decoded data, call setUserName(), setPassword(), setHost() and setPort() |
2013 | individually. |
2014 | |
2015 | \sa setUserInfo(), setHost(), setPort() |
2016 | */ |
2017 | void QUrl::setAuthority(const QString &authority, ParsingMode mode) |
2018 | { |
2019 | detach(); |
2020 | d->clearError(); |
2021 | |
2022 | if (mode == DecodedMode) { |
2023 | qWarning("QUrl::setAuthority(): QUrl::DecodedMode is not permitted in this function" ); |
2024 | return; |
2025 | } |
2026 | |
2027 | d->setAuthority(authority, 0, authority.length(), mode); |
2028 | if (authority.isNull()) { |
2029 | // QUrlPrivate::setAuthority cleared almost everything |
2030 | // but it leaves the Host bit set |
2031 | d->sectionIsPresent &= ~QUrlPrivate::Authority; |
2032 | } |
2033 | } |
2034 | |
2035 | /*! |
2036 | Returns the authority of the URL if it is defined; otherwise |
2037 | an empty string is returned. |
2038 | |
2039 | This function returns an unambiguous value, which may contain that |
2040 | characters still percent-encoded, plus some control sequences not |
2041 | representable in decoded form in QString. |
2042 | |
2043 | The \a options argument controls how to format the user info component. The |
2044 | value of QUrl::FullyDecoded is not permitted in this function. If you need |
2045 | to obtain fully decoded data, call userName(), password(), host() and |
2046 | port() individually. |
2047 | |
2048 | \sa setAuthority(), userInfo(), userName(), password(), host(), port() |
2049 | */ |
2050 | QString QUrl::authority(ComponentFormattingOptions options) const |
2051 | { |
2052 | QString result; |
2053 | if (!d) |
2054 | return result; |
2055 | |
2056 | if (options == QUrl::FullyDecoded) { |
2057 | qWarning("QUrl::authority(): QUrl::FullyDecoded is not permitted in this function" ); |
2058 | return result; |
2059 | } |
2060 | |
2061 | d->appendAuthority(result, options, QUrlPrivate::Authority); |
2062 | return result; |
2063 | } |
2064 | |
2065 | /*! |
2066 | Sets the user info of the URL to \a userInfo. The user info is an |
2067 | optional part of the authority of the URL, as described in |
2068 | setAuthority(). |
2069 | |
2070 | The user info consists of a user name and optionally a password, |
2071 | separated by a ':'. If the password is empty, the colon must be |
2072 | omitted. The following example shows a valid user info string: |
2073 | |
2074 | \image qurl-authority3.png |
2075 | |
2076 | The \a userInfo data is interpreted according to \a mode: in StrictMode, |
2077 | any '%' characters must be followed by exactly two hexadecimal characters |
2078 | and some characters (including space) are not allowed in undecoded form. In |
2079 | TolerantMode (the default), all characters are accepted in undecoded form |
2080 | and the tolerant parser will correct stray '%' not followed by two hex |
2081 | characters. |
2082 | |
2083 | This function does not allow \a mode to be QUrl::DecodedMode. To set fully |
2084 | decoded data, call setUserName() and setPassword() individually. |
2085 | |
2086 | \sa userInfo(), setUserName(), setPassword(), setAuthority() |
2087 | */ |
2088 | void QUrl::setUserInfo(const QString &userInfo, ParsingMode mode) |
2089 | { |
2090 | detach(); |
2091 | d->clearError(); |
2092 | QString trimmed = userInfo.trimmed(); |
2093 | if (mode == DecodedMode) { |
2094 | qWarning("QUrl::setUserInfo(): QUrl::DecodedMode is not permitted in this function" ); |
2095 | return; |
2096 | } |
2097 | |
2098 | d->setUserInfo(trimmed, 0, trimmed.length()); |
2099 | if (userInfo.isNull()) { |
2100 | // QUrlPrivate::setUserInfo cleared almost everything |
2101 | // but it leaves the UserName bit set |
2102 | d->sectionIsPresent &= ~QUrlPrivate::UserInfo; |
2103 | } else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::UserInfo, userInfo)) { |
2104 | d->sectionIsPresent &= ~QUrlPrivate::UserInfo; |
2105 | d->userName.clear(); |
2106 | d->password.clear(); |
2107 | } |
2108 | } |
2109 | |
2110 | /*! |
2111 | Returns the user info of the URL, or an empty string if the user |
2112 | info is undefined. |
2113 | |
2114 | This function returns an unambiguous value, which may contain that |
2115 | characters still percent-encoded, plus some control sequences not |
2116 | representable in decoded form in QString. |
2117 | |
2118 | The \a options argument controls how to format the user info component. The |
2119 | value of QUrl::FullyDecoded is not permitted in this function. If you need |
2120 | to obtain fully decoded data, call userName() and password() individually. |
2121 | |
2122 | \sa setUserInfo(), userName(), password(), authority() |
2123 | */ |
2124 | QString QUrl::userInfo(ComponentFormattingOptions options) const |
2125 | { |
2126 | QString result; |
2127 | if (!d) |
2128 | return result; |
2129 | |
2130 | if (options == QUrl::FullyDecoded) { |
2131 | qWarning("QUrl::userInfo(): QUrl::FullyDecoded is not permitted in this function" ); |
2132 | return result; |
2133 | } |
2134 | |
2135 | d->appendUserInfo(result, options, QUrlPrivate::UserInfo); |
2136 | return result; |
2137 | } |
2138 | |
2139 | /*! |
2140 | Sets the URL's user name to \a userName. The \a userName is part |
2141 | of the user info element in the authority of the URL, as described |
2142 | in setUserInfo(). |
2143 | |
2144 | The \a userName data is interpreted according to \a mode: in StrictMode, |
2145 | any '%' characters must be followed by exactly two hexadecimal characters |
2146 | and some characters (including space) are not allowed in undecoded form. In |
2147 | TolerantMode (the default), all characters are accepted in undecoded form |
2148 | and the tolerant parser will correct stray '%' not followed by two hex |
2149 | characters. In DecodedMode, '%' stand for themselves and encoded characters |
2150 | are not possible. |
2151 | |
2152 | QUrl::DecodedMode should be used when setting the user name from a data |
2153 | source which is not a URL, such as a password dialog shown to the user or |
2154 | with a user name obtained by calling userName() with the QUrl::FullyDecoded |
2155 | formatting option. |
2156 | |
2157 | \sa userName(), setUserInfo() |
2158 | */ |
2159 | void QUrl::setUserName(const QString &userName, ParsingMode mode) |
2160 | { |
2161 | detach(); |
2162 | d->clearError(); |
2163 | |
2164 | QString data = userName; |
2165 | if (mode == DecodedMode) { |
2166 | parseDecodedComponent(data); |
2167 | mode = TolerantMode; |
2168 | } |
2169 | |
2170 | d->setUserName(data, 0, data.length()); |
2171 | if (userName.isNull()) |
2172 | d->sectionIsPresent &= ~QUrlPrivate::UserName; |
2173 | else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::UserName, userName)) |
2174 | d->userName.clear(); |
2175 | } |
2176 | |
2177 | /*! |
2178 | Returns the user name of the URL if it is defined; otherwise |
2179 | an empty string is returned. |
2180 | |
2181 | The \a options argument controls how to format the user name component. All |
2182 | values produce an unambiguous result. With QUrl::FullyDecoded, all |
2183 | percent-encoded sequences are decoded; otherwise, the returned value may |
2184 | contain some percent-encoded sequences for some control sequences not |
2185 | representable in decoded form in QString. |
2186 | |
2187 | Note that QUrl::FullyDecoded may cause data loss if those non-representable |
2188 | sequences are present. It is recommended to use that value when the result |
2189 | will be used in a non-URL context, such as setting in QAuthenticator or |
2190 | negotiating a login. |
2191 | |
2192 | \sa setUserName(), userInfo() |
2193 | */ |
2194 | QString QUrl::userName(ComponentFormattingOptions options) const |
2195 | { |
2196 | QString result; |
2197 | if (d) |
2198 | d->appendUserName(result, options); |
2199 | return result; |
2200 | } |
2201 | |
2202 | /*! |
2203 | Sets the URL's password to \a password. The \a password is part of |
2204 | the user info element in the authority of the URL, as described in |
2205 | setUserInfo(). |
2206 | |
2207 | The \a password data is interpreted according to \a mode: in StrictMode, |
2208 | any '%' characters must be followed by exactly two hexadecimal characters |
2209 | and some characters (including space) are not allowed in undecoded form. In |
2210 | TolerantMode, all characters are accepted in undecoded form and the |
2211 | tolerant parser will correct stray '%' not followed by two hex characters. |
2212 | In DecodedMode, '%' stand for themselves and encoded characters are not |
2213 | possible. |
2214 | |
2215 | QUrl::DecodedMode should be used when setting the password from a data |
2216 | source which is not a URL, such as a password dialog shown to the user or |
2217 | with a password obtained by calling password() with the QUrl::FullyDecoded |
2218 | formatting option. |
2219 | |
2220 | \sa password(), setUserInfo() |
2221 | */ |
2222 | void QUrl::setPassword(const QString &password, ParsingMode mode) |
2223 | { |
2224 | detach(); |
2225 | d->clearError(); |
2226 | |
2227 | QString data = password; |
2228 | if (mode == DecodedMode) { |
2229 | parseDecodedComponent(data); |
2230 | mode = TolerantMode; |
2231 | } |
2232 | |
2233 | d->setPassword(data, 0, data.length()); |
2234 | if (password.isNull()) |
2235 | d->sectionIsPresent &= ~QUrlPrivate::Password; |
2236 | else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Password, password)) |
2237 | d->password.clear(); |
2238 | } |
2239 | |
2240 | /*! |
2241 | Returns the password of the URL if it is defined; otherwise |
2242 | an empty string is returned. |
2243 | |
2244 | The \a options argument controls how to format the user name component. All |
2245 | values produce an unambiguous result. With QUrl::FullyDecoded, all |
2246 | percent-encoded sequences are decoded; otherwise, the returned value may |
2247 | contain some percent-encoded sequences for some control sequences not |
2248 | representable in decoded form in QString. |
2249 | |
2250 | Note that QUrl::FullyDecoded may cause data loss if those non-representable |
2251 | sequences are present. It is recommended to use that value when the result |
2252 | will be used in a non-URL context, such as setting in QAuthenticator or |
2253 | negotiating a login. |
2254 | |
2255 | \sa setPassword() |
2256 | */ |
2257 | QString QUrl::password(ComponentFormattingOptions options) const |
2258 | { |
2259 | QString result; |
2260 | if (d) |
2261 | d->appendPassword(result, options); |
2262 | return result; |
2263 | } |
2264 | |
2265 | /*! |
2266 | Sets the host of the URL to \a host. The host is part of the |
2267 | authority. |
2268 | |
2269 | The \a host data is interpreted according to \a mode: in StrictMode, |
2270 | any '%' characters must be followed by exactly two hexadecimal characters |
2271 | and some characters (including space) are not allowed in undecoded form. In |
2272 | TolerantMode, all characters are accepted in undecoded form and the |
2273 | tolerant parser will correct stray '%' not followed by two hex characters. |
2274 | In DecodedMode, '%' stand for themselves and encoded characters are not |
2275 | possible. |
2276 | |
2277 | Note that, in all cases, the result of the parsing must be a valid hostname |
2278 | according to STD 3 rules, as modified by the Internationalized Resource |
2279 | Identifiers specification (RFC 3987). Invalid hostnames are not permitted |
2280 | and will cause isValid() to become false. |
2281 | |
2282 | \sa host(), setAuthority() |
2283 | */ |
2284 | void QUrl::setHost(const QString &host, ParsingMode mode) |
2285 | { |
2286 | detach(); |
2287 | d->clearError(); |
2288 | |
2289 | QString data = host; |
2290 | if (mode == DecodedMode) { |
2291 | parseDecodedComponent(data); |
2292 | mode = TolerantMode; |
2293 | } |
2294 | |
2295 | if (d->setHost(data, 0, data.length(), mode)) { |
2296 | if (host.isNull()) |
2297 | d->sectionIsPresent &= ~QUrlPrivate::Host; |
2298 | } else if (!data.startsWith(QLatin1Char('['))) { |
2299 | // setHost failed, it might be IPv6 or IPvFuture in need of bracketing |
2300 | Q_ASSERT(d->error); |
2301 | |
2302 | data.prepend(QLatin1Char('[')); |
2303 | data.append(QLatin1Char(']')); |
2304 | if (!d->setHost(data, 0, data.length(), mode)) { |
2305 | // failed again |
2306 | if (data.contains(QLatin1Char(':'))) { |
2307 | // source data contains ':', so it's an IPv6 error |
2308 | d->error->code = QUrlPrivate::InvalidIPv6AddressError; |
2309 | } |
2310 | } else { |
2311 | // succeeded |
2312 | d->clearError(); |
2313 | } |
2314 | } |
2315 | } |
2316 | |
2317 | /*! |
2318 | Returns the host of the URL if it is defined; otherwise |
2319 | an empty string is returned. |
2320 | |
2321 | The \a options argument controls how the hostname will be formatted. The |
2322 | QUrl::EncodeUnicode option will cause this function to return the hostname |
2323 | in the ASCII-Compatible Encoding (ACE) form, which is suitable for use in |
2324 | channels that are not 8-bit clean or that require the legacy hostname (such |
2325 | as DNS requests or in HTTP request headers). If that flag is not present, |
2326 | this function returns the International Domain Name (IDN) in Unicode form, |
2327 | according to the list of permissible top-level domains (see |
2328 | idnWhitelist()). |
2329 | |
2330 | All other flags are ignored. Host names cannot contain control or percent |
2331 | characters, so the returned value can be considered fully decoded. |
2332 | |
2333 | \sa setHost(), idnWhitelist(), setIdnWhitelist(), authority() |
2334 | */ |
2335 | QString QUrl::host(ComponentFormattingOptions options) const |
2336 | { |
2337 | QString result; |
2338 | if (d) { |
2339 | d->appendHost(result, options); |
2340 | if (result.startsWith(QLatin1Char('['))) |
2341 | result = result.mid(1, result.length() - 2); |
2342 | } |
2343 | return result; |
2344 | } |
2345 | |
2346 | /*! |
2347 | Sets the port of the URL to \a port. The port is part of the |
2348 | authority of the URL, as described in setAuthority(). |
2349 | |
2350 | \a port must be between 0 and 65535 inclusive. Setting the |
2351 | port to -1 indicates that the port is unspecified. |
2352 | */ |
2353 | void QUrl::setPort(int port) |
2354 | { |
2355 | detach(); |
2356 | d->clearError(); |
2357 | |
2358 | if (port < -1 || port > 65535) { |
2359 | d->setError(QUrlPrivate::InvalidPortError, QString::number(port), 0); |
2360 | port = -1; |
2361 | } |
2362 | |
2363 | d->port = port; |
2364 | if (port != -1) |
2365 | d->sectionIsPresent |= QUrlPrivate::Host; |
2366 | } |
2367 | |
2368 | /*! |
2369 | \since 4.1 |
2370 | |
2371 | Returns the port of the URL, or \a defaultPort if the port is |
2372 | unspecified. |
2373 | |
2374 | Example: |
2375 | |
2376 | \snippet code/src_corelib_io_qurl.cpp 3 |
2377 | */ |
2378 | int QUrl::port(int defaultPort) const |
2379 | { |
2380 | if (!d) return defaultPort; |
2381 | return d->port == -1 ? defaultPort : d->port; |
2382 | } |
2383 | |
2384 | /*! |
2385 | Sets the path of the URL to \a path. The path is the part of the |
2386 | URL that comes after the authority but before the query string. |
2387 | |
2388 | \image qurl-ftppath.png |
2389 | |
2390 | For non-hierarchical schemes, the path will be everything |
2391 | following the scheme declaration, as in the following example: |
2392 | |
2393 | \image qurl-mailtopath.png |
2394 | |
2395 | The \a path data is interpreted according to \a mode: in StrictMode, |
2396 | any '%' characters must be followed by exactly two hexadecimal characters |
2397 | and some characters (including space) are not allowed in undecoded form. In |
2398 | TolerantMode, all characters are accepted in undecoded form and the |
2399 | tolerant parser will correct stray '%' not followed by two hex characters. |
2400 | In DecodedMode, '%' stand for themselves and encoded characters are not |
2401 | possible. |
2402 | |
2403 | QUrl::DecodedMode should be used when setting the path from a data source |
2404 | which is not a URL, such as a dialog shown to the user or with a path |
2405 | obtained by calling path() with the QUrl::FullyDecoded formatting option. |
2406 | |
2407 | \sa path() |
2408 | */ |
2409 | void QUrl::setPath(const QString &path, ParsingMode mode) |
2410 | { |
2411 | detach(); |
2412 | d->clearError(); |
2413 | |
2414 | QString data = path; |
2415 | if (mode == DecodedMode) { |
2416 | parseDecodedComponent(data); |
2417 | mode = TolerantMode; |
2418 | } |
2419 | |
2420 | d->setPath(data, 0, data.length()); |
2421 | |
2422 | // optimized out, since there is no path delimiter |
2423 | // if (path.isNull()) |
2424 | // d->sectionIsPresent &= ~QUrlPrivate::Path; |
2425 | // else |
2426 | if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Path, path)) |
2427 | d->path.clear(); |
2428 | } |
2429 | |
2430 | /*! |
2431 | Returns the path of the URL. |
2432 | |
2433 | \snippet code/src_corelib_io_qurl.cpp 12 |
2434 | |
2435 | The \a options argument controls how to format the path component. All |
2436 | values produce an unambiguous result. With QUrl::FullyDecoded, all |
2437 | percent-encoded sequences are decoded; otherwise, the returned value may |
2438 | contain some percent-encoded sequences for some control sequences not |
2439 | representable in decoded form in QString. |
2440 | |
2441 | Note that QUrl::FullyDecoded may cause data loss if those non-representable |
2442 | sequences are present. It is recommended to use that value when the result |
2443 | will be used in a non-URL context, such as sending to an FTP server. |
2444 | |
2445 | An example of data loss is when you have non-Unicode percent-encoded sequences |
2446 | and use FullyDecoded (the default): |
2447 | |
2448 | \snippet code/src_corelib_io_qurl.cpp 13 |
2449 | |
2450 | In this example, there will be some level of data loss because the \c %FF cannot |
2451 | be converted. |
2452 | |
2453 | Data loss can also occur when the path contains sub-delimiters (such as \c +): |
2454 | |
2455 | \snippet code/src_corelib_io_qurl.cpp 14 |
2456 | |
2457 | Other decoding examples: |
2458 | |
2459 | \snippet code/src_corelib_io_qurl.cpp 15 |
2460 | |
2461 | \sa setPath() |
2462 | */ |
2463 | QString QUrl::path(ComponentFormattingOptions options) const |
2464 | { |
2465 | QString result; |
2466 | if (d) |
2467 | d->appendPath(result, options, QUrlPrivate::Path); |
2468 | return result; |
2469 | } |
2470 | |
2471 | /*! |
2472 | \since 5.2 |
2473 | |
2474 | Returns the name of the file, excluding the directory path. |
2475 | |
2476 | Note that, if this QUrl object is given a path ending in a slash, the name of the file is considered empty. |
2477 | |
2478 | If the path doesn't contain any slash, it is fully returned as the fileName. |
2479 | |
2480 | Example: |
2481 | |
2482 | \snippet code/src_corelib_io_qurl.cpp 7 |
2483 | |
2484 | The \a options argument controls how to format the file name component. All |
2485 | values produce an unambiguous result. With QUrl::FullyDecoded, all |
2486 | percent-encoded sequences are decoded; otherwise, the returned value may |
2487 | contain some percent-encoded sequences for some control sequences not |
2488 | representable in decoded form in QString. |
2489 | |
2490 | \sa path() |
2491 | */ |
2492 | QString QUrl::fileName(ComponentFormattingOptions options) const |
2493 | { |
2494 | const QString ourPath = path(options); |
2495 | const int slash = ourPath.lastIndexOf(QLatin1Char('/')); |
2496 | if (slash == -1) |
2497 | return ourPath; |
2498 | return ourPath.mid(slash + 1); |
2499 | } |
2500 | |
2501 | /*! |
2502 | \since 4.2 |
2503 | |
2504 | Returns \c true if this URL contains a Query (i.e., if ? was seen on it). |
2505 | |
2506 | \sa setQuery(), query(), hasFragment() |
2507 | */ |
2508 | bool QUrl::hasQuery() const |
2509 | { |
2510 | if (!d) return false; |
2511 | return d->hasQuery(); |
2512 | } |
2513 | |
2514 | /*! |
2515 | Sets the query string of the URL to \a query. |
2516 | |
2517 | This function is useful if you need to pass a query string that |
2518 | does not fit into the key-value pattern, or that uses a different |
2519 | scheme for encoding special characters than what is suggested by |
2520 | QUrl. |
2521 | |
2522 | Passing a value of QString() to \a query (a null QString) unsets |
2523 | the query completely. However, passing a value of QString("") |
2524 | will set the query to an empty value, as if the original URL |
2525 | had a lone "?". |
2526 | |
2527 | The \a query data is interpreted according to \a mode: in StrictMode, |
2528 | any '%' characters must be followed by exactly two hexadecimal characters |
2529 | and some characters (including space) are not allowed in undecoded form. In |
2530 | TolerantMode, all characters are accepted in undecoded form and the |
2531 | tolerant parser will correct stray '%' not followed by two hex characters. |
2532 | In DecodedMode, '%' stand for themselves and encoded characters are not |
2533 | possible. |
2534 | |
2535 | Query strings often contain percent-encoded sequences, so use of |
2536 | DecodedMode is discouraged. One special sequence to be aware of is that of |
2537 | the plus character ('+'). QUrl does not convert spaces to plus characters, |
2538 | even though HTML forms posted by web browsers do. In order to represent an |
2539 | actual plus character in a query, the sequence "%2B" is usually used. This |
2540 | function will leave "%2B" sequences untouched in TolerantMode or |
2541 | StrictMode. |
2542 | |
2543 | \sa query(), hasQuery() |
2544 | */ |
2545 | void QUrl::setQuery(const QString &query, ParsingMode mode) |
2546 | { |
2547 | detach(); |
2548 | d->clearError(); |
2549 | |
2550 | QString data = query; |
2551 | if (mode == DecodedMode) { |
2552 | parseDecodedComponent(data); |
2553 | mode = TolerantMode; |
2554 | } |
2555 | |
2556 | d->setQuery(data, 0, data.length()); |
2557 | if (query.isNull()) |
2558 | d->sectionIsPresent &= ~QUrlPrivate::Query; |
2559 | else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Query, query)) |
2560 | d->query.clear(); |
2561 | } |
2562 | |
2563 | /*! |
2564 | \overload |
2565 | \since 5.0 |
2566 | Sets the query string of the URL to \a query. |
2567 | |
2568 | This function reconstructs the query string from the QUrlQuery object and |
2569 | sets on this QUrl object. This function does not have parsing parameters |
2570 | because the QUrlQuery contains data that is already parsed. |
2571 | |
2572 | \sa query(), hasQuery() |
2573 | */ |
2574 | void QUrl::setQuery(const QUrlQuery &query) |
2575 | { |
2576 | detach(); |
2577 | d->clearError(); |
2578 | |
2579 | // we know the data is in the right format |
2580 | d->query = query.toString(); |
2581 | if (query.isEmpty()) |
2582 | d->sectionIsPresent &= ~QUrlPrivate::Query; |
2583 | else |
2584 | d->sectionIsPresent |= QUrlPrivate::Query; |
2585 | } |
2586 | |
2587 | /*! |
2588 | Returns the query string of the URL if there's a query string, or an empty |
2589 | result if not. To determine if the parsed URL contained a query string, use |
2590 | hasQuery(). |
2591 | |
2592 | The \a options argument controls how to format the query component. All |
2593 | values produce an unambiguous result. With QUrl::FullyDecoded, all |
2594 | percent-encoded sequences are decoded; otherwise, the returned value may |
2595 | contain some percent-encoded sequences for some control sequences not |
2596 | representable in decoded form in QString. |
2597 | |
2598 | Note that use of QUrl::FullyDecoded in queries is discouraged, as queries |
2599 | often contain data that is supposed to remain percent-encoded, including |
2600 | the use of the "%2B" sequence to represent a plus character ('+'). |
2601 | |
2602 | \sa setQuery(), hasQuery() |
2603 | */ |
2604 | QString QUrl::query(ComponentFormattingOptions options) const |
2605 | { |
2606 | QString result; |
2607 | if (d) { |
2608 | d->appendQuery(result, options, QUrlPrivate::Query); |
2609 | if (d->hasQuery() && result.isNull()) |
2610 | result.detach(); |
2611 | } |
2612 | return result; |
2613 | } |
2614 | |
2615 | /*! |
2616 | Sets the fragment of the URL to \a fragment. The fragment is the |
2617 | last part of the URL, represented by a '#' followed by a string of |
2618 | characters. It is typically used in HTTP for referring to a |
2619 | certain link or point on a page: |
2620 | |
2621 | \image qurl-fragment.png |
2622 | |
2623 | The fragment is sometimes also referred to as the URL "reference". |
2624 | |
2625 | Passing an argument of QString() (a null QString) will unset the fragment. |
2626 | Passing an argument of QString("") (an empty but not null QString) will set the |
2627 | fragment to an empty string (as if the original URL had a lone "#"). |
2628 | |
2629 | The \a fragment data is interpreted according to \a mode: in StrictMode, |
2630 | any '%' characters must be followed by exactly two hexadecimal characters |
2631 | and some characters (including space) are not allowed in undecoded form. In |
2632 | TolerantMode, all characters are accepted in undecoded form and the |
2633 | tolerant parser will correct stray '%' not followed by two hex characters. |
2634 | In DecodedMode, '%' stand for themselves and encoded characters are not |
2635 | possible. |
2636 | |
2637 | QUrl::DecodedMode should be used when setting the fragment from a data |
2638 | source which is not a URL or with a fragment obtained by calling |
2639 | fragment() with the QUrl::FullyDecoded formatting option. |
2640 | |
2641 | \sa fragment(), hasFragment() |
2642 | */ |
2643 | void QUrl::setFragment(const QString &fragment, ParsingMode mode) |
2644 | { |
2645 | detach(); |
2646 | d->clearError(); |
2647 | |
2648 | QString data = fragment; |
2649 | if (mode == DecodedMode) { |
2650 | parseDecodedComponent(data); |
2651 | mode = TolerantMode; |
2652 | } |
2653 | |
2654 | d->setFragment(data, 0, data.length()); |
2655 | if (fragment.isNull()) |
2656 | d->sectionIsPresent &= ~QUrlPrivate::Fragment; |
2657 | else if (mode == StrictMode && !d->validateComponent(QUrlPrivate::Fragment, fragment)) |
2658 | d->fragment.clear(); |
2659 | } |
2660 | |
2661 | /*! |
2662 | Returns the fragment of the URL. To determine if the parsed URL contained a |
2663 | fragment, use hasFragment(). |
2664 | |
2665 | The \a options argument controls how to format the fragment component. All |
2666 | values produce an unambiguous result. With QUrl::FullyDecoded, all |
2667 | percent-encoded sequences are decoded; otherwise, the returned value may |
2668 | contain some percent-encoded sequences for some control sequences not |
2669 | representable in decoded form in QString. |
2670 | |
2671 | Note that QUrl::FullyDecoded may cause data loss if those non-representable |
2672 | sequences are present. It is recommended to use that value when the result |
2673 | will be used in a non-URL context. |
2674 | |
2675 | \sa setFragment(), hasFragment() |
2676 | */ |
2677 | QString QUrl::fragment(ComponentFormattingOptions options) const |
2678 | { |
2679 | QString result; |
2680 | if (d) { |
2681 | d->appendFragment(result, options, QUrlPrivate::Fragment); |
2682 | if (d->hasFragment() && result.isNull()) |
2683 | result.detach(); |
2684 | } |
2685 | return result; |
2686 | } |
2687 | |
2688 | /*! |
2689 | \since 4.2 |
2690 | |
2691 | Returns \c true if this URL contains a fragment (i.e., if # was seen on it). |
2692 | |
2693 | \sa fragment(), setFragment() |
2694 | */ |
2695 | bool QUrl::hasFragment() const |
2696 | { |
2697 | if (!d) return false; |
2698 | return d->hasFragment(); |
2699 | } |
2700 | |
2701 | /*! |
2702 | Returns the result of the merge of this URL with \a relative. This |
2703 | URL is used as a base to convert \a relative to an absolute URL. |
2704 | |
2705 | If \a relative is not a relative URL, this function will return \a |
2706 | relative directly. Otherwise, the paths of the two URLs are |
2707 | merged, and the new URL returned has the scheme and authority of |
2708 | the base URL, but with the merged path, as in the following |
2709 | example: |
2710 | |
2711 | \snippet code/src_corelib_io_qurl.cpp 5 |
2712 | |
2713 | Calling resolved() with ".." returns a QUrl whose directory is |
2714 | one level higher than the original. Similarly, calling resolved() |
2715 | with "../.." removes two levels from the path. If \a relative is |
2716 | "/", the path becomes "/". |
2717 | |
2718 | \sa isRelative() |
2719 | */ |
2720 | QUrl QUrl::resolved(const QUrl &relative) const |
2721 | { |
2722 | if (!d) return relative; |
2723 | if (!relative.d) return *this; |
2724 | |
2725 | QUrl t; |
2726 | if (!relative.d->scheme.isEmpty()) { |
2727 | t = relative; |
2728 | t.detach(); |
2729 | } else { |
2730 | if (relative.d->hasAuthority()) { |
2731 | t = relative; |
2732 | t.detach(); |
2733 | } else { |
2734 | t.d = new QUrlPrivate; |
2735 | |
2736 | // copy the authority |
2737 | t.d->userName = d->userName; |
2738 | t.d->password = d->password; |
2739 | t.d->host = d->host; |
2740 | t.d->port = d->port; |
2741 | t.d->sectionIsPresent = d->sectionIsPresent & QUrlPrivate::Authority; |
2742 | |
2743 | if (relative.d->path.isEmpty()) { |
2744 | t.d->path = d->path; |
2745 | if (relative.d->hasQuery()) { |
2746 | t.d->query = relative.d->query; |
2747 | t.d->sectionIsPresent |= QUrlPrivate::Query; |
2748 | } else if (d->hasQuery()) { |
2749 | t.d->query = d->query; |
2750 | t.d->sectionIsPresent |= QUrlPrivate::Query; |
2751 | } |
2752 | } else { |
2753 | t.d->path = relative.d->path.startsWith(QLatin1Char('/')) |
2754 | ? relative.d->path |
2755 | : d->mergePaths(relative.d->path); |
2756 | if (relative.d->hasQuery()) { |
2757 | t.d->query = relative.d->query; |
2758 | t.d->sectionIsPresent |= QUrlPrivate::Query; |
2759 | } |
2760 | } |
2761 | } |
2762 | t.d->scheme = d->scheme; |
2763 | if (d->hasScheme()) |
2764 | t.d->sectionIsPresent |= QUrlPrivate::Scheme; |
2765 | else |
2766 | t.d->sectionIsPresent &= ~QUrlPrivate::Scheme; |
2767 | t.d->flags |= d->flags & QUrlPrivate::IsLocalFile; |
2768 | } |
2769 | t.d->fragment = relative.d->fragment; |
2770 | if (relative.d->hasFragment()) |
2771 | t.d->sectionIsPresent |= QUrlPrivate::Fragment; |
2772 | else |
2773 | t.d->sectionIsPresent &= ~QUrlPrivate::Fragment; |
2774 | |
2775 | removeDotsFromPath(&t.d->path); |
2776 | |
2777 | #if defined(QURL_DEBUG) |
2778 | qDebug("QUrl(\"%ls\").resolved(\"%ls\") = \"%ls\"" , |
2779 | qUtf16Printable(url()), |
2780 | qUtf16Printable(relative.url()), |
2781 | qUtf16Printable(t.url())); |
2782 | #endif |
2783 | return t; |
2784 | } |
2785 | |
2786 | /*! |
2787 | Returns \c true if the URL is relative; otherwise returns \c false. A URL is |
2788 | relative reference if its scheme is undefined; this function is therefore |
2789 | equivalent to calling scheme().isEmpty(). |
2790 | |
2791 | Relative references are defined in RFC 3986 section 4.2. |
2792 | |
2793 | \sa {Relative URLs vs Relative Paths} |
2794 | */ |
2795 | bool QUrl::isRelative() const |
2796 | { |
2797 | if (!d) return true; |
2798 | return !d->hasScheme(); |
2799 | } |
2800 | |
2801 | /*! |
2802 | Returns a string representation of the URL. The output can be customized by |
2803 | passing flags with \a options. The option QUrl::FullyDecoded is not |
2804 | permitted in this function since it would generate ambiguous data. |
2805 | |
2806 | The resulting QString can be passed back to a QUrl later on. |
2807 | |
2808 | Synonym for toString(options). |
2809 | |
2810 | \sa FormattingOptions, toEncoded(), toString() |
2811 | */ |
2812 | QString QUrl::url(FormattingOptions options) const |
2813 | { |
2814 | return toString(options); |
2815 | } |
2816 | |
2817 | /*! |
2818 | Returns a string representation of the URL. The output can be customized by |
2819 | passing flags with \a options. The option QUrl::FullyDecoded is not |
2820 | permitted in this function since it would generate ambiguous data. |
2821 | |
2822 | The default formatting option is \l{QUrl::FormattingOptions}{PrettyDecoded}. |
2823 | |
2824 | \sa FormattingOptions, url(), setUrl() |
2825 | */ |
2826 | QString QUrl::toString(FormattingOptions options) const |
2827 | { |
2828 | QString url; |
2829 | if (!isValid()) { |
2830 | // also catches isEmpty() |
2831 | return url; |
2832 | } |
2833 | if ((options & QUrl::FullyDecoded) == QUrl::FullyDecoded) { |
2834 | qWarning("QUrl: QUrl::FullyDecoded is not permitted when reconstructing the full URL" ); |
2835 | options &= ~QUrl::FullyDecoded; |
2836 | //options |= QUrl::PrettyDecoded; // no-op, value is 0 |
2837 | } |
2838 | |
2839 | // return just the path if: |
2840 | // - QUrl::PreferLocalFile is passed |
2841 | // - QUrl::RemovePath isn't passed (rather stupid if the user did...) |
2842 | // - there's no query or fragment to return |
2843 | // that is, either they aren't present, or we're removing them |
2844 | // - it's a local file |
2845 | if (options.testFlag(QUrl::PreferLocalFile) && !options.testFlag(QUrl::RemovePath) |
2846 | && (!d->hasQuery() || options.testFlag(QUrl::RemoveQuery)) |
2847 | && (!d->hasFragment() || options.testFlag(QUrl::RemoveFragment)) |
2848 | && isLocalFile()) { |
2849 | url = d->toLocalFile(options | QUrl::FullyDecoded); |
2850 | return url; |
2851 | } |
2852 | |
2853 | // for the full URL, we consider that the reserved characters are prettier if encoded |
2854 | if (options & DecodeReserved) |
2855 | options &= ~EncodeReserved; |
2856 | else |
2857 | options |= EncodeReserved; |
2858 | |
2859 | if (!(options & QUrl::RemoveScheme) && d->hasScheme()) |
2860 | url += d->scheme + QLatin1Char(':'); |
2861 | |
2862 | bool pathIsAbsolute = d->path.startsWith(QLatin1Char('/')); |
2863 | if (!((options & QUrl::RemoveAuthority) == QUrl::RemoveAuthority) && d->hasAuthority()) { |
2864 | url += QLatin1String("//" ); |
2865 | d->appendAuthority(url, options, QUrlPrivate::FullUrl); |
2866 | } else if (isLocalFile() && pathIsAbsolute) { |
2867 | // Comply with the XDG file URI spec, which requires triple slashes. |
2868 | url += QLatin1String("//" ); |
2869 | } |
2870 | |
2871 | if (!(options & QUrl::RemovePath)) |
2872 | d->appendPath(url, options, QUrlPrivate::FullUrl); |
2873 | |
2874 | if (!(options & QUrl::RemoveQuery) && d->hasQuery()) { |
2875 | url += QLatin1Char('?'); |
2876 | d->appendQuery(url, options, QUrlPrivate::FullUrl); |
2877 | } |
2878 | if (!(options & QUrl::RemoveFragment) && d->hasFragment()) { |
2879 | url += QLatin1Char('#'); |
2880 | d->appendFragment(url, options, QUrlPrivate::FullUrl); |
2881 | } |
2882 | |
2883 | return url; |
2884 | } |
2885 | |
2886 | /*! |
2887 | \since 5.0 |
2888 | |
2889 | Returns a human-displayable string representation of the URL. |
2890 | The output can be customized by passing flags with \a options. |
2891 | The option RemovePassword is always enabled, since passwords |
2892 | should never be shown back to users. |
2893 | |
2894 | With the default options, the resulting QString can be passed back |
2895 | to a QUrl later on, but any password that was present initially will |
2896 | be lost. |
2897 | |
2898 | \sa FormattingOptions, toEncoded(), toString() |
2899 | */ |
2900 | |
2901 | QString QUrl::toDisplayString(FormattingOptions options) const |
2902 | { |
2903 | return toString(options | RemovePassword); |
2904 | } |
2905 | |
2906 | /*! |
2907 | \since 5.2 |
2908 | |
2909 | Returns an adjusted version of the URL. |
2910 | The output can be customized by passing flags with \a options. |
2911 | |
2912 | The encoding options from QUrl::ComponentFormattingOption don't make |
2913 | much sense for this method, nor does QUrl::PreferLocalFile. |
2914 | |
2915 | This is always equivalent to QUrl(url.toString(options)). |
2916 | |
2917 | \sa FormattingOptions, toEncoded(), toString() |
2918 | */ |
2919 | QUrl QUrl::adjusted(QUrl::FormattingOptions options) const |
2920 | { |
2921 | if (!isValid()) { |
2922 | // also catches isEmpty() |
2923 | return QUrl(); |
2924 | } |
2925 | QUrl that = *this; |
2926 | if (options & RemoveScheme) |
2927 | that.setScheme(QString()); |
2928 | if ((options & RemoveAuthority) == RemoveAuthority) { |
2929 | that.setAuthority(QString()); |
2930 | } else { |
2931 | if ((options & RemoveUserInfo) == RemoveUserInfo) |
2932 | that.setUserInfo(QString()); |
2933 | else if (options & RemovePassword) |
2934 | that.setPassword(QString()); |
2935 | if (options & RemovePort) |
2936 | that.setPort(-1); |
2937 | } |
2938 | if (options & RemoveQuery) |
2939 | that.setQuery(QString()); |
2940 | if (options & RemoveFragment) |
2941 | that.setFragment(QString()); |
2942 | if (options & RemovePath) { |
2943 | that.setPath(QString()); |
2944 | } else if (options & (StripTrailingSlash | RemoveFilename | NormalizePathSegments)) { |
2945 | that.detach(); |
2946 | QString path; |
2947 | d->appendPath(path, options | FullyEncoded, QUrlPrivate::Path); |
2948 | that.d->setPath(path, 0, path.length()); |
2949 | } |
2950 | return that; |
2951 | } |
2952 | |
2953 | /*! |
2954 | Returns the encoded representation of the URL if it's valid; |
2955 | otherwise an empty QByteArray is returned. The output can be |
2956 | customized by passing flags with \a options. |
2957 | |
2958 | The user info, path and fragment are all converted to UTF-8, and |
2959 | all non-ASCII characters are then percent encoded. The host name |
2960 | is encoded using Punycode. |
2961 | */ |
2962 | QByteArray QUrl::toEncoded(FormattingOptions options) const |
2963 | { |
2964 | options &= ~(FullyDecoded | FullyEncoded); |
2965 | return toString(options | FullyEncoded).toLatin1(); |
2966 | } |
2967 | |
2968 | /*! |
2969 | \fn QUrl QUrl::fromEncoded(const QByteArray &input, ParsingMode parsingMode) |
2970 | |
2971 | Parses \a input and returns the corresponding QUrl. \a input is |
2972 | assumed to be in encoded form, containing only ASCII characters. |
2973 | |
2974 | Parses the URL using \a parsingMode. See setUrl() for more information on |
2975 | this parameter. QUrl::DecodedMode is not permitted in this context. |
2976 | |
2977 | \sa toEncoded(), setUrl() |
2978 | */ |
2979 | QUrl QUrl::fromEncoded(const QByteArray &input, ParsingMode mode) |
2980 | { |
2981 | return QUrl(QString::fromUtf8(input.constData(), input.size()), mode); |
2982 | } |
2983 | |
2984 | /*! |
2985 | Returns a decoded copy of \a input. \a input is first decoded from |
2986 | percent encoding, then converted from UTF-8 to unicode. |
2987 | |
2988 | \note Given invalid input (such as a string containing the sequence "%G5", |
2989 | which is not a valid hexadecimal number) the output will be invalid as |
2990 | well. As an example: the sequence "%G5" could be decoded to 'W'. |
2991 | */ |
2992 | QString QUrl::fromPercentEncoding(const QByteArray &input) |
2993 | { |
2994 | QByteArray ba = QByteArray::fromPercentEncoding(input); |
2995 | return QString::fromUtf8(ba, ba.size()); |
2996 | } |
2997 | |
2998 | /*! |
2999 | Returns an encoded copy of \a input. \a input is first converted |
3000 | to UTF-8, and all ASCII-characters that are not in the unreserved group |
3001 | are percent encoded. To prevent characters from being percent encoded |
3002 | pass them to \a exclude. To force characters to be percent encoded pass |
3003 | them to \a include. |
3004 | |
3005 | Unreserved is defined as: |
3006 | \tt {ALPHA / DIGIT / "-" / "." / "_" / "~"} |
3007 | |
3008 | \snippet code/src_corelib_io_qurl.cpp 6 |
3009 | */ |
3010 | QByteArray QUrl::toPercentEncoding(const QString &input, const QByteArray &exclude, const QByteArray &include) |
3011 | { |
3012 | return input.toUtf8().toPercentEncoding(exclude, include); |
3013 | } |
3014 | |
3015 | /*! |
3016 | \since 4.2 |
3017 | |
3018 | Returns the Unicode form of the given domain name |
3019 | \a domain, which is encoded in the ASCII Compatible Encoding (ACE). |
3020 | The result of this function is considered equivalent to \a domain. |
3021 | |
3022 | If the value in \a domain cannot be encoded, it will be converted |
3023 | to QString and returned. |
3024 | |
3025 | The ASCII Compatible Encoding (ACE) is defined by RFC 3490, RFC 3491 |
3026 | and RFC 3492. It is part of the Internationalizing Domain Names in |
3027 | Applications (IDNA) specification, which allows for domain names |
3028 | (like \c "example.com") to be written using international |
3029 | characters. |
3030 | */ |
3031 | QString QUrl::fromAce(const QByteArray &domain) |
3032 | { |
3033 | QVarLengthArray<char16_t> buffer; |
3034 | buffer.resize(domain.size()); |
3035 | qt_from_latin1(buffer.data(), domain.data(), domain.size()); |
3036 | return qt_ACE_do(QStringView{buffer.data(), buffer.size()}, |
3037 | NormalizeAce, ForbidLeadingDot /*FIXME: make configurable*/); |
3038 | } |
3039 | |
3040 | /*! |
3041 | \since 4.2 |
3042 | |
3043 | Returns the ASCII Compatible Encoding of the given domain name \a domain. |
3044 | The result of this function is considered equivalent to \a domain. |
3045 | |
3046 | The ASCII-Compatible Encoding (ACE) is defined by RFC 3490, RFC 3491 |
3047 | and RFC 3492. It is part of the Internationalizing Domain Names in |
3048 | Applications (IDNA) specification, which allows for domain names |
3049 | (like \c "example.com") to be written using international |
3050 | characters. |
3051 | |
3052 | This function returns an empty QByteArray if \a domain is not a valid |
3053 | hostname. Note, in particular, that IPv6 literals are not valid domain |
3054 | names. |
3055 | */ |
3056 | QByteArray QUrl::toAce(const QString &domain) |
3057 | { |
3058 | return qt_ACE_do(domain, ToAceOnly, ForbidLeadingDot /*FIXME: make configurable*/).toLatin1(); |
3059 | } |
3060 | |
3061 | /*! |
3062 | \internal |
3063 | |
3064 | Returns \c true if this URL is "less than" the given \a url. This |
3065 | provides a means of ordering URLs. |
3066 | */ |
3067 | bool QUrl::operator <(const QUrl &url) const |
3068 | { |
3069 | if (!d || !url.d) { |
3070 | bool thisIsEmpty = !d || d->isEmpty(); |
3071 | bool thatIsEmpty = !url.d || url.d->isEmpty(); |
3072 | |
3073 | // sort an empty URL first |
3074 | return thisIsEmpty && !thatIsEmpty; |
3075 | } |
3076 | |
3077 | int cmp; |
3078 | cmp = d->scheme.compare(url.d->scheme); |
3079 | if (cmp != 0) |
3080 | return cmp < 0; |
3081 | |
3082 | cmp = d->userName.compare(url.d->userName); |
3083 | if (cmp != 0) |
3084 | return cmp < 0; |
3085 | |
3086 | cmp = d->password.compare(url.d->password); |
3087 | if (cmp != 0) |
3088 | return cmp < 0; |
3089 | |
3090 | cmp = d->host.compare(url.d->host); |
3091 | if (cmp != 0) |
3092 | return cmp < 0; |
3093 | |
3094 | if (d->port != url.d->port) |
3095 | return d->port < url.d->port; |
3096 | |
3097 | cmp = d->path.compare(url.d->path); |
3098 | if (cmp != 0) |
3099 | return cmp < 0; |
3100 | |
3101 | if (d->hasQuery() != url.d->hasQuery()) |
3102 | return url.d->hasQuery(); |
3103 | |
3104 | cmp = d->query.compare(url.d->query); |
3105 | if (cmp != 0) |
3106 | return cmp < 0; |
3107 | |
3108 | if (d->hasFragment() != url.d->hasFragment()) |
3109 | return url.d->hasFragment(); |
3110 | |
3111 | cmp = d->fragment.compare(url.d->fragment); |
3112 | return cmp < 0; |
3113 | } |
3114 | |
3115 | /*! |
3116 | Returns \c true if this URL and the given \a url are equal; |
3117 | otherwise returns \c false. |
3118 | */ |
3119 | bool QUrl::operator ==(const QUrl &url) const |
3120 | { |
3121 | if (!d && !url.d) |
3122 | return true; |
3123 | if (!d) |
3124 | return url.d->isEmpty(); |
3125 | if (!url.d) |
3126 | return d->isEmpty(); |
3127 | |
3128 | // First, compare which sections are present, since it speeds up the |
3129 | // processing considerably. We just have to ignore the host-is-present flag |
3130 | // for local files (the "file" protocol), due to the requirements of the |
3131 | // XDG file URI specification. |
3132 | int mask = QUrlPrivate::FullUrl; |
3133 | if (isLocalFile()) |
3134 | mask &= ~QUrlPrivate::Host; |
3135 | return (d->sectionIsPresent & mask) == (url.d->sectionIsPresent & mask) && |
3136 | d->scheme == url.d->scheme && |
3137 | d->userName == url.d->userName && |
3138 | d->password == url.d->password && |
3139 | d->host == url.d->host && |
3140 | d->port == url.d->port && |
3141 | d->path == url.d->path && |
3142 | d->query == url.d->query && |
3143 | d->fragment == url.d->fragment; |
3144 | } |
3145 | |
3146 | /*! |
3147 | \since 5.2 |
3148 | |
3149 | Returns \c true if this URL and the given \a url are equal after |
3150 | applying \a options to both; otherwise returns \c false. |
3151 | |
3152 | This is equivalent to calling adjusted(options) on both URLs |
3153 | and comparing the resulting urls, but faster. |
3154 | |
3155 | */ |
3156 | bool QUrl::matches(const QUrl &url, FormattingOptions options) const |
3157 | { |
3158 | if (!d && !url.d) |
3159 | return true; |
3160 | if (!d) |
3161 | return url.d->isEmpty(); |
3162 | if (!url.d) |
3163 | return d->isEmpty(); |
3164 | |
3165 | // First, compare which sections are present, since it speeds up the |
3166 | // processing considerably. We just have to ignore the host-is-present flag |
3167 | // for local files (the "file" protocol), due to the requirements of the |
3168 | // XDG file URI specification. |
3169 | int mask = QUrlPrivate::FullUrl; |
3170 | if (isLocalFile()) |
3171 | mask &= ~QUrlPrivate::Host; |
3172 | |
3173 | if (options.testFlag(QUrl::RemoveScheme)) |
3174 | mask &= ~QUrlPrivate::Scheme; |
3175 | else if (d->scheme != url.d->scheme) |
3176 | return false; |
3177 | |
3178 | if (options.testFlag(QUrl::RemovePassword)) |
3179 | mask &= ~QUrlPrivate::Password; |
3180 | else if (d->password != url.d->password) |
3181 | return false; |
3182 | |
3183 | if (options.testFlag(QUrl::RemoveUserInfo)) |
3184 | mask &= ~QUrlPrivate::UserName; |
3185 | else if (d->userName != url.d->userName) |
3186 | return false; |
3187 | |
3188 | if (options.testFlag(QUrl::RemovePort)) |
3189 | mask &= ~QUrlPrivate::Port; |
3190 | else if (d->port != url.d->port) |
3191 | return false; |
3192 | |
3193 | if (options.testFlag(QUrl::RemoveAuthority)) |
3194 | mask &= ~QUrlPrivate::Host; |
3195 | else if (d->host != url.d->host) |
3196 | return false; |
3197 | |
3198 | if (options.testFlag(QUrl::RemoveQuery)) |
3199 | mask &= ~QUrlPrivate::Query; |
3200 | else if (d->query != url.d->query) |
3201 | return false; |
3202 | |
3203 | if (options.testFlag(QUrl::RemoveFragment)) |
3204 | mask &= ~QUrlPrivate::Fragment; |
3205 | else if (d->fragment != url.d->fragment) |
3206 | return false; |
3207 | |
3208 | if ((d->sectionIsPresent & mask) != (url.d->sectionIsPresent & mask)) |
3209 | return false; |
3210 | |
3211 | if (options.testFlag(QUrl::RemovePath)) |
3212 | return true; |
3213 | |
3214 | // Compare paths, after applying path-related options |
3215 | QString path1; |
3216 | d->appendPath(path1, options, QUrlPrivate::Path); |
3217 | QString path2; |
3218 | url.d->appendPath(path2, options, QUrlPrivate::Path); |
3219 | return path1 == path2; |
3220 | } |
3221 | |
3222 | /*! |
3223 | Returns \c true if this URL and the given \a url are not equal; |
3224 | otherwise returns \c false. |
3225 | */ |
3226 | bool QUrl::operator !=(const QUrl &url) const |
3227 | { |
3228 | return !(*this == url); |
3229 | } |
3230 | |
3231 | /*! |
3232 | Assigns the specified \a url to this object. |
3233 | */ |
3234 | QUrl &QUrl::operator =(const QUrl &url) |
3235 | { |
3236 | if (!d) { |
3237 | if (url.d) { |
3238 | url.d->ref.ref(); |
3239 | d = url.d; |
3240 | } |
3241 | } else { |
3242 | if (url.d) |
3243 | qAtomicAssign(d, url.d); |
3244 | else |
3245 | clear(); |
3246 | } |
3247 | return *this; |
3248 | } |
3249 | |
3250 | /*! |
3251 | Assigns the specified \a url to this object. |
3252 | */ |
3253 | QUrl &QUrl::operator =(const QString &url) |
3254 | { |
3255 | if (url.isEmpty()) { |
3256 | clear(); |
3257 | } else { |
3258 | detach(); |
3259 | d->parse(url, TolerantMode); |
3260 | } |
3261 | return *this; |
3262 | } |
3263 | |
3264 | /*! |
3265 | \fn void QUrl::swap(QUrl &other) |
3266 | \since 4.8 |
3267 | |
3268 | Swaps URL \a other with this URL. This operation is very |
3269 | fast and never fails. |
3270 | */ |
3271 | |
3272 | /*! |
3273 | \internal |
3274 | |
3275 | Forces a detach. |
3276 | */ |
3277 | void QUrl::detach() |
3278 | { |
3279 | if (!d) |
3280 | d = new QUrlPrivate; |
3281 | else |
3282 | qAtomicDetach(d); |
3283 | } |
3284 | |
3285 | /*! |
3286 | \internal |
3287 | */ |
3288 | bool QUrl::isDetached() const |
3289 | { |
3290 | return !d || d->ref.loadRelaxed() == 1; |
3291 | } |
3292 | |
3293 | |
3294 | /*! |
3295 | Returns a QUrl representation of \a localFile, interpreted as a local |
3296 | file. This function accepts paths separated by slashes as well as the |
3297 | native separator for this platform. |
3298 | |
3299 | This function also accepts paths with a doubled leading slash (or |
3300 | backslash) to indicate a remote file, as in |
3301 | "//servername/path/to/file.txt". Note that only certain platforms can |
3302 | actually open this file using QFile::open(). |
3303 | |
3304 | An empty \a localFile leads to an empty URL (since Qt 5.4). |
3305 | |
3306 | \snippet code/src_corelib_io_qurl.cpp 16 |
3307 | |
3308 | In the first line in snippet above, a file URL is constructed from a |
3309 | local, relative path. A file URL with a relative path only makes sense |
3310 | if there is a base URL to resolve it against. For example: |
3311 | |
3312 | \snippet code/src_corelib_io_qurl.cpp 17 |
3313 | |
3314 | To resolve such a URL, it's necessary to remove the scheme beforehand: |
3315 | |
3316 | \snippet code/src_corelib_io_qurl.cpp 18 |
3317 | |
3318 | For this reason, it is better to use a relative URL (that is, no scheme) |
3319 | for relative file paths: |
3320 | |
3321 | \snippet code/src_corelib_io_qurl.cpp 19 |
3322 | |
3323 | \sa toLocalFile(), isLocalFile(), QDir::toNativeSeparators() |
3324 | */ |
3325 | QUrl QUrl::fromLocalFile(const QString &localFile) |
3326 | { |
3327 | QUrl url; |
3328 | if (localFile.isEmpty()) |
3329 | return url; |
3330 | QString scheme = fileScheme(); |
3331 | QString deslashified = QDir::fromNativeSeparators(localFile); |
3332 | |
3333 | // magic for drives on windows |
3334 | if (deslashified.length() > 1 && deslashified.at(1) == QLatin1Char(':') && deslashified.at(0) != QLatin1Char('/')) { |
3335 | deslashified.prepend(QLatin1Char('/')); |
3336 | } else if (deslashified.startsWith(QLatin1String("//" ))) { |
3337 | // magic for shared drive on windows |
3338 | int indexOfPath = deslashified.indexOf(QLatin1Char('/'), 2); |
3339 | QStringView hostSpec = QStringView{deslashified}.mid(2, indexOfPath - 2); |
3340 | // Check for Windows-specific WebDAV specification: "//host@SSL/path". |
3341 | if (hostSpec.endsWith(webDavSslTag(), Qt::CaseInsensitive)) { |
3342 | hostSpec.truncate(hostSpec.size() - 4); |
3343 | scheme = webDavScheme(); |
3344 | } |
3345 | |
3346 | // hosts can't be IPv6 addresses without [], so we can use QUrlPrivate::setHost |
3347 | url.detach(); |
3348 | if (!url.d->setHost(hostSpec.toString(), 0, hostSpec.size(), StrictMode)) { |
3349 | if (url.d->error->code != QUrlPrivate::InvalidRegNameError) |
3350 | return url; |
3351 | |
3352 | // Path hostname is not a valid URL host, so set it entirely in the path |
3353 | // (by leaving deslashified unchanged) |
3354 | } else if (indexOfPath > 2) { |
3355 | deslashified = deslashified.right(deslashified.length() - indexOfPath); |
3356 | } else { |
3357 | deslashified.clear(); |
3358 | } |
3359 | } |
3360 | |
3361 | url.setScheme(scheme); |
3362 | url.setPath(deslashified, DecodedMode); |
3363 | return url; |
3364 | } |
3365 | |
3366 | /*! |
3367 | Returns the path of this URL formatted as a local file path. The path |
3368 | returned will use forward slashes, even if it was originally created |
3369 | from one with backslashes. |
3370 | |
3371 | If this URL contains a non-empty hostname, it will be encoded in the |
3372 | returned value in the form found on SMB networks (for example, |
3373 | "//servername/path/to/file.txt"). |
3374 | |
3375 | \snippet code/src_corelib_io_qurl.cpp 20 |
3376 | |
3377 | Note: if the path component of this URL contains a non-UTF-8 binary |
3378 | sequence (such as %80), the behaviour of this function is undefined. |
3379 | |
3380 | \sa fromLocalFile(), isLocalFile() |
3381 | */ |
3382 | QString QUrl::toLocalFile() const |
3383 | { |
3384 | // the call to isLocalFile() also ensures that we're parsed |
3385 | if (!isLocalFile()) |
3386 | return QString(); |
3387 | |
3388 | return d->toLocalFile(QUrl::FullyDecoded); |
3389 | } |
3390 | |
3391 | /*! |
3392 | \since 4.8 |
3393 | Returns \c true if this URL is pointing to a local file path. A URL is a |
3394 | local file path if the scheme is "file". |
3395 | |
3396 | Note that this function considers URLs with hostnames to be local file |
3397 | paths, even if the eventual file path cannot be opened with |
3398 | QFile::open(). |
3399 | |
3400 | \sa fromLocalFile(), toLocalFile() |
3401 | */ |
3402 | bool QUrl::isLocalFile() const |
3403 | { |
3404 | return d && d->isLocalFile(); |
3405 | } |
3406 | |
3407 | /*! |
3408 | Returns \c true if this URL is a parent of \a childUrl. \a childUrl is a child |
3409 | of this URL if the two URLs share the same scheme and authority, |
3410 | and this URL's path is a parent of the path of \a childUrl. |
3411 | */ |
3412 | bool QUrl::isParentOf(const QUrl &childUrl) const |
3413 | { |
3414 | QString childPath = childUrl.path(); |
3415 | |
3416 | if (!d) |
3417 | return ((childUrl.scheme().isEmpty()) |
3418 | && (childUrl.authority().isEmpty()) |
3419 | && childPath.length() > 0 && childPath.at(0) == QLatin1Char('/')); |
3420 | |
3421 | QString ourPath = path(); |
3422 | |
3423 | return ((childUrl.scheme().isEmpty() || d->scheme == childUrl.scheme()) |
3424 | && (childUrl.authority().isEmpty() || authority() == childUrl.authority()) |
3425 | && childPath.startsWith(ourPath) |
3426 | && ((ourPath.endsWith(QLatin1Char('/')) && childPath.length() > ourPath.length()) |
3427 | || (!ourPath.endsWith(QLatin1Char('/')) |
3428 | && childPath.length() > ourPath.length() && childPath.at(ourPath.length()) == QLatin1Char('/')))); |
3429 | } |
3430 | |
3431 | |
3432 | #ifndef QT_NO_DATASTREAM |
3433 | /*! \relates QUrl |
3434 | |
3435 | Writes url \a url to the stream \a out and returns a reference |
3436 | to the stream. |
3437 | |
3438 | \sa{Serializing Qt Data Types}{Format of the QDataStream operators} |
3439 | */ |
3440 | QDataStream &operator<<(QDataStream &out, const QUrl &url) |
3441 | { |
3442 | QByteArray u; |
3443 | if (url.isValid()) |
3444 | u = url.toEncoded(); |
3445 | out << u; |
3446 | return out; |
3447 | } |
3448 | |
3449 | /*! \relates QUrl |
3450 | |
3451 | Reads a url into \a url from the stream \a in and returns a |
3452 | reference to the stream. |
3453 | |
3454 | \sa{Serializing Qt Data Types}{Format of the QDataStream operators} |
3455 | */ |
3456 | QDataStream &operator>>(QDataStream &in, QUrl &url) |
3457 | { |
3458 | QByteArray u; |
3459 | in >> u; |
3460 | url.setUrl(QString::fromLatin1(u)); |
3461 | return in; |
3462 | } |
3463 | #endif // QT_NO_DATASTREAM |
3464 | |
3465 | #ifndef QT_NO_DEBUG_STREAM |
3466 | QDebug operator<<(QDebug d, const QUrl &url) |
3467 | { |
3468 | QDebugStateSaver saver(d); |
3469 | d.nospace() << "QUrl(" << url.toDisplayString() << ')'; |
3470 | return d; |
3471 | } |
3472 | #endif |
3473 | |
3474 | static QString errorMessage(QUrlPrivate::ErrorCode errorCode, const QString &errorSource, int errorPosition) |
3475 | { |
3476 | QChar c = uint(errorPosition) < uint(errorSource.length()) ? |
3477 | errorSource.at(errorPosition) : QChar(QChar::Null); |
3478 | |
3479 | switch (errorCode) { |
3480 | case QUrlPrivate::NoError: |
3481 | Q_ASSERT_X(false, "QUrl::errorString" , |
3482 | "Impossible: QUrl::errorString should have treated this condition" ); |
3483 | Q_UNREACHABLE(); |
3484 | return QString(); |
3485 | |
3486 | case QUrlPrivate::InvalidSchemeError: { |
3487 | auto msg = QLatin1String("Invalid scheme (character '%1' not permitted)" ); |
3488 | return msg.arg(c); |
3489 | } |
3490 | |
3491 | case QUrlPrivate::InvalidUserNameError: |
3492 | return QLatin1String("Invalid user name (character '%1' not permitted)" ) |
3493 | .arg(c); |
3494 | |
3495 | case QUrlPrivate::InvalidPasswordError: |
3496 | return QLatin1String("Invalid password (character '%1' not permitted)" ) |
3497 | .arg(c); |
3498 | |
3499 | case QUrlPrivate::InvalidRegNameError: |
3500 | if (errorPosition != -1) |
3501 | return QLatin1String("Invalid hostname (character '%1' not permitted)" ) |
3502 | .arg(c); |
3503 | else |
3504 | return QStringLiteral("Invalid hostname (contains invalid characters)" ); |
3505 | case QUrlPrivate::InvalidIPv4AddressError: |
3506 | return QString(); // doesn't happen yet |
3507 | case QUrlPrivate::InvalidIPv6AddressError: |
3508 | return QStringLiteral("Invalid IPv6 address" ); |
3509 | case QUrlPrivate::InvalidCharacterInIPv6Error: |
3510 | return QLatin1String("Invalid IPv6 address (character '%1' not permitted)" ).arg(c); |
3511 | case QUrlPrivate::InvalidIPvFutureError: |
3512 | return QLatin1String("Invalid IPvFuture address (character '%1' not permitted)" ).arg(c); |
3513 | case QUrlPrivate::HostMissingEndBracket: |
3514 | return QStringLiteral("Expected ']' to match '[' in hostname" ); |
3515 | |
3516 | case QUrlPrivate::InvalidPortError: |
3517 | return QStringLiteral("Invalid port or port number out of range" ); |
3518 | case QUrlPrivate::PortEmptyError: |
3519 | return QStringLiteral("Port field was empty" ); |
3520 | |
3521 | case QUrlPrivate::InvalidPathError: |
3522 | return QLatin1String("Invalid path (character '%1' not permitted)" ) |
3523 | .arg(c); |
3524 | |
3525 | case QUrlPrivate::InvalidQueryError: |
3526 | return QLatin1String("Invalid query (character '%1' not permitted)" ) |
3527 | .arg(c); |
3528 | |
3529 | case QUrlPrivate::InvalidFragmentError: |
3530 | return QLatin1String("Invalid fragment (character '%1' not permitted)" ) |
3531 | .arg(c); |
3532 | |
3533 | case QUrlPrivate::AuthorityPresentAndPathIsRelative: |
3534 | return QStringLiteral("Path component is relative and authority is present" ); |
3535 | case QUrlPrivate::AuthorityAbsentAndPathIsDoubleSlash: |
3536 | return QStringLiteral("Path component starts with '//' and authority is absent" ); |
3537 | case QUrlPrivate::RelativeUrlPathContainsColonBeforeSlash: |
3538 | return QStringLiteral("Relative URL's path component contains ':' before any '/'" ); |
3539 | } |
3540 | |
3541 | Q_ASSERT_X(false, "QUrl::errorString" , "Cannot happen, unknown error" ); |
3542 | Q_UNREACHABLE(); |
3543 | return QString(); |
3544 | } |
3545 | |
3546 | static inline void appendComponentIfPresent(QString &msg, bool present, const char *componentName, |
3547 | const QString &component) |
3548 | { |
3549 | if (present) { |
3550 | msg += QLatin1String(componentName); |
3551 | msg += QLatin1Char('"'); |
3552 | msg += component; |
3553 | msg += QLatin1String("\"," ); |
3554 | } |
3555 | } |
3556 | |
3557 | /*! |
3558 | \since 4.2 |
3559 | |
3560 | Returns an error message if the last operation that modified this QUrl |
3561 | object ran into a parsing error. If no error was detected, this function |
3562 | returns an empty string and isValid() returns \c true. |
3563 | |
3564 | The error message returned by this function is technical in nature and may |
3565 | not be understood by end users. It is mostly useful to developers trying to |
3566 | understand why QUrl will not accept some input. |
3567 | |
3568 | \sa QUrl::ParsingMode |
3569 | */ |
3570 | QString QUrl::errorString() const |
3571 | { |
3572 | QString msg; |
3573 | if (!d) |
3574 | return msg; |
3575 | |
3576 | QString errorSource; |
3577 | int errorPosition = 0; |
3578 | QUrlPrivate::ErrorCode errorCode = d->validityError(&errorSource, &errorPosition); |
3579 | if (errorCode == QUrlPrivate::NoError) |
3580 | return msg; |
3581 | |
3582 | msg += errorMessage(errorCode, errorSource, errorPosition); |
3583 | msg += QLatin1String("; source was \"" ); |
3584 | msg += errorSource; |
3585 | msg += QLatin1String("\";" ); |
3586 | appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Scheme, |
3587 | " scheme = " , d->scheme); |
3588 | appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::UserInfo, |
3589 | " userinfo = " , userInfo()); |
3590 | appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Host, |
3591 | " host = " , d->host); |
3592 | appendComponentIfPresent(msg, d->port != -1, |
3593 | " port = " , QString::number(d->port)); |
3594 | appendComponentIfPresent(msg, !d->path.isEmpty(), |
3595 | " path = " , d->path); |
3596 | appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Query, |
3597 | " query = " , d->query); |
3598 | appendComponentIfPresent(msg, d->sectionIsPresent & QUrlPrivate::Fragment, |
3599 | " fragment = " , d->fragment); |
3600 | if (msg.endsWith(QLatin1Char(','))) |
3601 | msg.chop(1); |
3602 | return msg; |
3603 | } |
3604 | |
3605 | /*! |
3606 | \since 5.1 |
3607 | |
3608 | Converts a list of \a urls into a list of QString objects, using toString(\a options). |
3609 | */ |
3610 | QStringList QUrl::toStringList(const QList<QUrl> &urls, FormattingOptions options) |
3611 | { |
3612 | QStringList lst; |
3613 | lst.reserve(urls.size()); |
3614 | for (const QUrl &url : urls) |
3615 | lst.append(url.toString(options)); |
3616 | return lst; |
3617 | |
3618 | } |
3619 | |
3620 | /*! |
3621 | \since 5.1 |
3622 | |
3623 | Converts a list of strings representing \a urls into a list of urls, using QUrl(str, \a mode). |
3624 | Note that this means all strings must be urls, not for instance local paths. |
3625 | */ |
3626 | QList<QUrl> QUrl::fromStringList(const QStringList &urls, ParsingMode mode) |
3627 | { |
3628 | QList<QUrl> lst; |
3629 | lst.reserve(urls.size()); |
3630 | for (const QString &str : urls) |
3631 | lst.append(QUrl(str, mode)); |
3632 | return lst; |
3633 | } |
3634 | |
3635 | /*! |
3636 | \typedef QUrl::DataPtr |
3637 | \internal |
3638 | */ |
3639 | |
3640 | /*! |
3641 | \fn DataPtr &QUrl::data_ptr() |
3642 | \internal |
3643 | */ |
3644 | |
3645 | /*! |
3646 | Returns the hash value for the \a url. If specified, \a seed is used to |
3647 | initialize the hash. |
3648 | |
3649 | \relates QHash |
3650 | \since 5.0 |
3651 | */ |
3652 | size_t qHash(const QUrl &url, size_t seed) noexcept |
3653 | { |
3654 | if (!url.d) |
3655 | return qHash(-1, seed); // the hash of an unset port (-1) |
3656 | |
3657 | return qHash(url.d->scheme) ^ |
3658 | qHash(url.d->userName) ^ |
3659 | qHash(url.d->password) ^ |
3660 | qHash(url.d->host) ^ |
3661 | qHash(url.d->port, seed) ^ |
3662 | qHash(url.d->path) ^ |
3663 | qHash(url.d->query) ^ |
3664 | qHash(url.d->fragment); |
3665 | } |
3666 | |
3667 | static QUrl adjustFtpPath(QUrl url) |
3668 | { |
3669 | if (url.scheme() == ftpScheme()) { |
3670 | QString path = url.path(QUrl::PrettyDecoded); |
3671 | if (path.startsWith(QLatin1String("//" ))) |
3672 | url.setPath(QLatin1String("/%2F" ) + QStringView{path}.mid(2), QUrl::TolerantMode); |
3673 | } |
3674 | return url; |
3675 | } |
3676 | |
3677 | static bool isIp6(const QString &text) |
3678 | { |
3679 | QIPAddressUtils::IPv6Address address; |
3680 | return !text.isEmpty() && QIPAddressUtils::parseIp6(address, text.begin(), text.end()) == nullptr; |
3681 | } |
3682 | |
3683 | /*! |
3684 | Returns a valid URL from a user supplied \a userInput string if one can be |
3685 | deduced. In the case that is not possible, an invalid QUrl() is returned. |
3686 | |
3687 | This allows the user to input a URL or a local file path in the form of a plain |
3688 | string. This string can be manually typed into a location bar, obtained from |
3689 | the clipboard, or passed in via command line arguments. |
3690 | |
3691 | When the string is not already a valid URL, a best guess is performed, |
3692 | making various assumptions. |
3693 | |
3694 | In the case the string corresponds to a valid file path on the system, |
3695 | a file:// URL is constructed, using QUrl::fromLocalFile(). |
3696 | |
3697 | If that is not the case, an attempt is made to turn the string into a |
3698 | http:// or ftp:// URL. The latter in the case the string starts with |
3699 | 'ftp'. The result is then passed through QUrl's tolerant parser, and |
3700 | in the case or success, a valid QUrl is returned, or else a QUrl(). |
3701 | |
3702 | \section1 Examples: |
3703 | |
3704 | \list |
3705 | \li qt-project.org becomes http://qt-project.org |
3706 | \li ftp.qt-project.org becomes ftp://ftp.qt-project.org |
3707 | \li hostname becomes http://hostname |
3708 | \li /home/user/test.html becomes file:///home/user/test.html |
3709 | \endlist |
3710 | |
3711 | In order to be able to handle relative paths, this method takes an optional |
3712 | \a workingDirectory path. This is especially useful when handling command |
3713 | line arguments. |
3714 | If \a workingDirectory is empty, no handling of relative paths will be done. |
3715 | |
3716 | By default, an input string that looks like a relative path will only be treated |
3717 | as such if the file actually exists in the given working directory. |
3718 | If the application can handle files that don't exist yet, it should pass the |
3719 | flag AssumeLocalFile in \a options. |
3720 | |
3721 | \since 5.4 |
3722 | */ |
3723 | QUrl QUrl::fromUserInput(const QString &userInput, const QString &workingDirectory, |
3724 | UserInputResolutionOptions options) |
3725 | { |
3726 | QString trimmedString = userInput.trimmed(); |
3727 | |
3728 | if (trimmedString.isEmpty()) |
3729 | return QUrl(); |
3730 | |
3731 | // Check for IPv6 addresses, since a path starting with ":" is absolute (a resource) |
3732 | // and IPv6 addresses can start with "c:" too |
3733 | if (isIp6(trimmedString)) { |
3734 | QUrl url; |
3735 | url.setHost(trimmedString); |
3736 | url.setScheme(QStringLiteral("http" )); |
3737 | return url; |
3738 | } |
3739 | |
3740 | const QUrl url = QUrl(trimmedString, QUrl::TolerantMode); |
3741 | |
3742 | // Check for a relative path |
3743 | if (!workingDirectory.isEmpty()) { |
3744 | const QFileInfo fileInfo(QDir(workingDirectory), userInput); |
3745 | if (fileInfo.exists()) |
3746 | return QUrl::fromLocalFile(fileInfo.absoluteFilePath()); |
3747 | |
3748 | // Check both QUrl::isRelative (to detect full URLs) and QDir::isAbsolutePath (since on Windows drive letters can be interpreted as schemes) |
3749 | if ((options & AssumeLocalFile) && url.isRelative() && !QDir::isAbsolutePath(userInput)) |
3750 | return QUrl::fromLocalFile(fileInfo.absoluteFilePath()); |
3751 | } |
3752 | |
3753 | // Check first for files, since on Windows drive letters can be interpretted as schemes |
3754 | if (QDir::isAbsolutePath(trimmedString)) |
3755 | return QUrl::fromLocalFile(trimmedString); |
3756 | |
3757 | QUrl urlPrepended = QUrl(QLatin1String("http://" ) + trimmedString, QUrl::TolerantMode); |
3758 | |
3759 | // Check the most common case of a valid url with a scheme |
3760 | // We check if the port would be valid by adding the scheme to handle the case host:port |
3761 | // where the host would be interpretted as the scheme |
3762 | if (url.isValid() |
3763 | && !url.scheme().isEmpty() |
3764 | && urlPrepended.port() == -1) |
3765 | return adjustFtpPath(url); |
3766 | |
3767 | // Else, try the prepended one and adjust the scheme from the host name |
3768 | if (urlPrepended.isValid() && (!urlPrepended.host().isEmpty() || !urlPrepended.path().isEmpty())) { |
3769 | int dotIndex = trimmedString.indexOf(QLatin1Char('.')); |
3770 | const QStringView hostscheme = QStringView{trimmedString}.left(dotIndex); |
3771 | if (hostscheme.compare(ftpScheme(), Qt::CaseInsensitive) == 0) |
3772 | urlPrepended.setScheme(ftpScheme()); |
3773 | return adjustFtpPath(urlPrepended); |
3774 | } |
3775 | |
3776 | return QUrl(); |
3777 | } |
3778 | |
3779 | QT_END_NAMESPACE |
3780 | |