Published 16.11.2022
wget has one quirk when downloading files: the file we want to download is created before the transfer starts. In other words, if the server we are downloading from returns an error, the file is left empty or damaged. This problem matters in Bash scripts.
To solve it, you can use temporary files.
Download the file to a temporary directory
wget "http://example.org/test.dat" -O "/tmp/test.tmp"
Check the exit code (if everything is OK, it should be 0)
if [ "$?" -eq 0 ]; then
# Work with the file
fi
If you need to report an error
if [ "$?" -ne 0 ]; then
rm "/tmp/test.tmp"
echo "Data download error!"
exit 1
fi
As an illustration, here is an example of a download function that uses wget
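download(){
# $1 is the URL, $2 is the destination path
local tmp=$(date | md5sum | head -c 20);
local USERAGENT="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:106.0) Gecko/20100101 Firefox/106.0"
# The trailing backslashes join all options into a single wget command
wget --user-agent="$USERAGENT" \
"$1" \
--header="Accept: text/html" \
--keep-session-cookies \
--secure-protocol=TLSv1_3 \
--header='Accept-Language: en-us' \
-O "/tmp/$tmp"
if [ "$?" -eq 0 ]; then
# Success: copy the result into place, then clean up
cp "/tmp/$tmp" "$2"
rm -f "/tmp/$tmp"
else
# Failure: drop the partial file and leave the destination untouched
rm -f "/tmp/$tmp"
return 1
fi
}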
Usage example
download "http://example.org/test.dat" "/var/test.dat"
The file /var/test.dat is updated (or created) only if the download succeeds
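A variation on the same idea, as a rough sketch only: the temporary file can be created with mktemp next to the destination, so the final step is a rename instead of a copy. The function name safe_download and its two arguments (URL, destination path) are assumptions made for this illustration, not part of the original post.

```shell
# Sketch: download into a temporary file, then move it into place.
# safe_download is a hypothetical name; $1 is the URL, $2 the destination.
safe_download() {
    local url=$1 dest=$2 tmp status
    # Create the temp file beside the destination so that mv is a rename
    # on the same filesystem.
    tmp=$(mktemp "${dest}.XXXXXX") || return 1
    wget -q -O "$tmp" "$url"
    status=$?
    if [ "$status" -eq 0 ]; then
        mv -f "$tmp" "$dest"
    else
        # Leave the existing destination untouched and report wget's status.
        rm -f "$tmp"
        return "$status"
    fi
}
```

It would be called the same way as download above, for example safe_download "http://example.org/test.dat" "/var/test.dat". Note that mktemp creates the file with restrictive permissions, so a chmod after the move may be needed.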
I’m writing a script to download a bunch of files, and I want it to inform when a particular file doesn’t exist.
r=`wget -q www.someurl.com`
if [ $r -ne 0 ]
then echo "Not there"
else echo "OK"
fi
But it gives the following error on execution:
./file: line 2: [: -ne: unary operator expected
What’s wrong?
zx8754
asked Apr 26, 2010 at 22:16
Others have correctly posted that you can use $? to get the most recent exit code:
wget_output=$(wget -q "$URL")
if [ $? -ne 0 ]; then
...
This lets you capture both the stdout and the exit code. If you don’t actually care what it prints, you can just test it directly:
if wget -q "$URL"; then
...
And if you want to suppress the output:
if wget -q "$URL" > /dev/null; then
...
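For the original goal (downloading a bunch of files and reporting the ones that are not there), the direct test can be put in a loop. This is only a sketch: report_missing is a hypothetical helper name, and it assumes the URLs are passed as arguments.

```shell
# report_missing is a hypothetical helper; it takes any number of URLs.
report_missing() {
    local url failed=0
    for url in "$@"; do
        # Test wget's exit status directly, without going through $?.
        if wget -q "$url"; then
            echo "OK: $url"
        else
            echo "Not there: $url"
            failed=1
        fi
    done
    # Non-zero if at least one download failed.
    return "$failed"
}
```

A call such as report_missing "$URL1" "$URL2" prints one line per URL and returns non-zero if any of them failed.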
answered Apr 26, 2010 at 22:32
Cascabel
$r is the text output of wget (which you’ve captured with backticks). To access the return code, use the $? variable.
answered Apr 26, 2010 at 22:25
nobody
$r is empty, and therefore your condition becomes if [ -ne 0 ] and it seems as if -ne is used as a unary operator. Try this instead:
wget -q www.someurl.com
if [ $? -ne 0 ]
...
EDIT As Andrew explained before me, backticks return standard output, while $? returns the exit code of the last operation.
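The difference between captured output and exit status can be seen without touching the network. In this small sketch, fake_fetch is a made-up stand-in for wget that prints a line and exits with status 3.

```shell
# fake_fetch is a hypothetical stand-in for wget: it prints text, then fails.
fake_fetch() {
    echo "some output"
    return 3
}

status=0
# The assignment captures stdout; its exit status is that of the command
# inside $(...), which is saved here when it is non-zero.
output=$(fake_fetch) || status=$?

# Quoting the variable keeps the test valid even if it were empty.
if [ "$status" -ne 0 ]; then
    echo "failed with status $status, output was: $output"
fi
# prints: failed with status 3, output was: some output
```

The variable holding the output says nothing about success or failure; only the status does.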
answered Apr 26, 2010 at 22:26
Bolo
You could just do this:
wget ruffingthewitness.com && echo "WE GOT IT" || echo "Failure"
-(~)----------------------------------------------------------(07:30 Tue Apr 27)
risk@DockMaster [2024] --> wget ruffingthewitness.com && echo "WE GOT IT" || echo "Failure"
--2010-04-27 07:30:56-- http://ruffingthewitness.com/
Resolving ruffingthewitness.com... 69.56.251.239
Connecting to ruffingthewitness.com|69.56.251.239|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `index.html.1'
[ <=> ] 14,252 72.7K/s in 0.2s
2010-04-27 07:30:58 (72.7 KB/s) - `index.html.1' saved [14252]
WE GOT IT
-(~)-----------------------------------------------------------------------------------------------------------(07:30 Tue Apr 27)
risk@DockMaster [2025] --> wget ruffingthewitness.biz && echo "WE GOT IT" || echo "Failure"
--2010-04-27 07:31:05-- http://ruffingthewitness.biz/
Resolving ruffingthewitness.biz... failed: Name or service not known.
wget: unable to resolve host address `ruffingthewitness.biz'
zsh: exit 1 wget ruffingthewitness.biz
Failure
-(~)-----------------------------------------------------------------------------------------------------------(07:31 Tue Apr 27)
risk@DockMaster [2026] -->
answered Apr 27, 2010 at 12:31
kSiR
The best way to capture the result from wget and also check the call status:
wget -O filename URL
if [[ $? -ne 0 ]]; then
echo "wget failed"
exit 1;
fi
This way you can check the status of wget as well as store the output data.
- If the call is successful, use the stored output
- Otherwise the script exits with the error "wget failed"
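If a plain "wget failed" is not informative enough, the exit status itself can be turned into a message. The sketch below follows the exit status list in the wget manual; the function name wget_status_message is an assumption made for illustration.

```shell
# wget_status_message is a hypothetical helper that maps wget's documented
# exit statuses to short descriptions.
wget_status_message() {
    case "$1" in
        0) echo "no problems occurred" ;;
        1) echo "generic error" ;;
        2) echo "parse error (command line or .wgetrc)" ;;
        3) echo "file I/O error" ;;
        4) echo "network failure" ;;
        5) echo "SSL verification failure" ;;
        6) echo "username/password authentication failure" ;;
        7) echo "protocol error" ;;
        8) echo "server issued an error response" ;;
        *) echo "unknown exit status: $1" ;;
    esac
}
```

After a failed call it could be used as echo "wget failed: $(wget_status_message "$status")", where status holds the saved value of $?.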
answered Aug 24, 2016 at 11:49
I have been trying all these solutions without luck.
wget runs non-interactively, and in my setup I could not get a usable return code from $? (note that wget only detaches into the background when it is started with -b).
One solution is to use the --server-response option and look for the HTTP 200 status code.
Example:
wget --server-response -q -o wgetOut http://www.someurl.com
sleep 5
_wgetHttpCode=`cat wgetOut | gawk '/HTTP/{ print $2 }'`
if [ "$_wgetHttpCode" != "200" ]; then
echo "[Error] `cat wgetOut`"
fi
Note: wget needs some time to finish its work, which is why I added "sleep 5". This is not the best approach, but it worked well enough to test the solution.
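When wget runs in the foreground, it returns only after the transfer has finished, so the log file should be complete by then and the status can be parsed without waiting. A minimal sketch follows, assuming the log contains lines of the form "HTTP/1.1 200 OK" as written by --server-response; last_http_status is a hypothetical helper name.

```shell
# last_http_status is a hypothetical helper: it prints the status code from
# the last "HTTP/x.y NNN" line of a wget log, i.e. the final response after
# any redirects.
last_http_status() {
    awk '$1 ~ /^HTTP\// { code = $2 } END { print code }' "$1"
}
```

With the log from the example above, the check would read: if [ "$(last_http_status wgetOut)" != "200" ]; then ... fi.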
BaCaRoZzo
answered Jan 21, 2015 at 18:22
wget normally stops when it gets an HTTP error, e.g. 404. Is there an option to make wget download the page content regardless of the HTTP code?
asked Mar 6, 2011 at 8:46
Parameter: --content-on-error, available from wget 1.14:
If this is set to on, wget will not skip the content when the server responds with a http status code that indicates error.
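A short sketch of how the option might be used in a script: the body is saved even when the server answers with an error, and wget's exit status is still returned to the caller. The function name fetch_keep_error_body and its arguments (URL, output file) are assumptions for this example.

```shell
# fetch_keep_error_body is a hypothetical helper; $1 is the URL and $2 the
# file that receives the response body.
fetch_keep_error_body() {
    local url=$1 body_file=$2
    # --content-on-error (wget 1.14 and later) keeps the body of error
    # responses; the function returns wget's own exit status.
    wget -q --content-on-error -O "$body_file" "$url"
}
```

The caller can then inspect both pieces, for example by running fetch_keep_error_body "$url" body.txt, saving $? and reading body.txt afterwards.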
answered Sep 21, 2013 at 15:59
Nowaker
You could try something like:
#!/bin/sh
# Quote $1 so the test also fails when no argument is given
[ -n "$1" ] || {
echo "Usage: $0 [url to file to get]" >&2
exit 1
}
# Run wget and handle a non-zero exit status
wget "$1" || {
echo "Could not download $1" | mail -s "Uh Oh" you@yourdomain.com
echo "Aww snap ..." >&2
exit 1
}
# If we're here, it downloaded successfully, and will exit with a normal status
When making a script that will (likely) be called by other scripts, it is important to do the following:
- Ensure argument sanity
- Send e-mail, write to a log, or do something else so someone knows what went wrong
The >&2 simply redirects the output of error messages to stderr, which allows a calling script to do something like this:
foo-downloader >/dev/null 2>/some/log/file.txt
Since it is a short wrapper, there is no reason to forsake a bit of sanity.
This also allows you to selectively direct the output of wget to /dev/null; you might actually want to see it when testing, especially if you get an e-mail saying it failed.
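The effect of >&2 can be tried in isolation. In this sketch, warn and demo_wrapper are made-up names: the first writes a diagnostic to stderr, the second stands in for the wrapper script and produces one normal line and one diagnostic.

```shell
# warn is a hypothetical helper: diagnostics go to stderr, not stdout.
warn() {
    printf '%s\n' "$*" >&2
}

# demo_wrapper stands in for the wrapper script: one line of normal output
# on stdout and one diagnostic on stderr.
demo_wrapper() {
    echo "normal output"
    warn "Could not download something"
}
```

Calling demo_wrapper >/dev/null 2>errors.log discards the normal output and leaves only the diagnostic in errors.log, which is the same split the caller of foo-downloader relies on.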
I need to test my HTTP server's responses in various cases, even for failed authentication. In case of failed authentication, my server returns 401 Unauthorized and also a response body which simply contains Unauthorized (or maybe some other detailed message).
Using e.g. curl or httpie, I get the response body in case of a 401 response.
$ curl http://10.5.1.1/bla
Unauthorized
$ curl http://10.5.1.1/bla --digest --user joe:wrong
Unauthorized
$ http http://10.5.1.1/bla -b
Unauthorized
$ http http://10.5.1.1/bla -b --auth-type digest --auth joe:wrong
Unauthorized
But when trying this using wget, I got no output:
$ wget http://10.5.1.1/bla -q -O /dev/stdout
$ wget http://10.5.1.1/bla -q -O /dev/stdout --user joe --password wrong
wget returns exit code 6 in this case, but I need to check the response message.
Here is a dump of the complete traffic, captured using httpie:
$ http http://10.5.1.1/bla --print hbHB
GET /bla HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: 10.5.1.1
User-Agent: HTTPie/0.9.2
HTTP/1.1 401 Unauthorized
Connection: keep-alive
Content-Length: 13
Content-Type: text/plain
Date: 2017-08-27 23:01:07
Server: Vistra-I St10 SW218 HW010
WWW-Authenticate: Digest realm="Vistra-I", qop="auth", nonce="a0c5d461f2b74b2b797b62f54200d125", opaque="0123456789abcdef0123456789abcdef"
Unauthorized
(Note that the Unauthorized message in the response body ends with a newline character, that’s why Content-Length: 13.)
The same happens if the server responds with 403 or 404, et cetera.
Any idea how to get the response body using wget in this case?
Edit 2017-09-22
Found the --content-on-error option in my wget 1.17.1 (see also the wget manual).
This works for e.g. response code 404, but not for 401 or 5xx codes.
For 401, see this bug.