Wget error handling

Published 16.11.2022

wget has one quirk when it downloads to a path given with -O: the output file is created before the transfer starts. In other words, if the server we are downloading from returns an error, the file is left empty or incomplete, and any previous copy at that path is lost. This matters most in Bash scripts.


To solve this problem, download to a temporary file first.

Download the file to a temporary directory

wget "http://example.org/test.dat" -O "/tmp/test.tmp"

Check the exit code immediately after wget runs (0 means success)

if [ "$?" -eq 0 ]; then
    # Work with the downloaded file here
    :
fi

If an error message needs to be reported

if [ "$?" -ne 0 ]; then
    rm -f "/tmp/test.tmp"
    echo "Data download error!" >&2
    exit 1
fi

As a worked example, here is a download function built on wget

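download(){
    local tmp status
    local USERAGENT="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:106.0) Gecko/20100101 Firefox/106.0"
    # mktemp creates a unique temporary file; a name derived from the date can collide
    tmp=$(mktemp /tmp/download.XXXXXX) || return 1
    # Every line of the wget command except the last must end with a backslash
    wget --user-agent="$USERAGENT" \
        "$1" \
        --header="Accept: text/html" \
        --keep-session-cookies \
        --secure-protocol=TLSv1_3 \
        --header='Accept-Language: en-us' \
        -O "$tmp"
    status=$?

    if [ "$status" -eq 0 ]; then
        # Copy to the destination only after a successful download
        cp "$tmp" "$2" || status=$?
    fi
    rm -f "$tmp"
    return "$status"
}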

Usage example

download "http://example.org/test.dat" "/var/test.dat"

The file /var/test.dat is updated (or created) only if the download succeeds.
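
A variation on the same idea, as a minimal sketch: create the temporary file next to the destination and rename it into place, so that the final step is a rename on one filesystem rather than a copy. The function name download_atomic and the .download.XXXXXX template are assumptions made for illustration, not part of the original article.

```shell
# download_atomic URL DEST
# Hypothetical helper: downloads URL to a temp file beside DEST and renames it
# into place only if wget succeeds. On failure DEST is left untouched.
download_atomic() {
    local url=$1 dest=$2 tmp status
    # Temp file lives in the destination directory so mv is a plain rename.
    tmp=$(mktemp "$(dirname -- "$dest")/.download.XXXXXX") || return 1
    if wget -q -O "$tmp" "$url"; then
        mv -f -- "$tmp" "$dest"
    else
        status=$?            # exit status of wget from the if condition
        rm -f -- "$tmp"
        return "$status"
    fi
}
```

It would be called the same way as the function above, for example download_atomic "http://example.org/test.dat" "/var/test.dat". Note that a file created by mktemp has restrictive permissions, so a chmod after the rename may be needed.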

I’m writing a script to download a bunch of files, and I want it to report when a particular file doesn’t exist.

r=`wget -q www.someurl.com`
if [ $r -ne 0 ]
  then echo "Not there"
  else echo "OK"
fi

But it gives the following error on execution:

./file: line 2: [: -ne: unary operator expected

What’s wrong?


asked Apr 26, 2010 at 22:16

Igor

Others have correctly posted that you can use $? to get the most recent exit code:

wget_output=$(wget -q "$URL")
if [ $? -ne 0 ]; then
    ...

This lets you capture both the stdout and the exit code. If you don’t actually care what it prints, you can just test it directly:

if wget -q "$URL"; then
    ...

And if you want to suppress the output:

if wget -q "$URL" > /dev/null; then
    ...
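
One detail worth spelling out: with -q and no -O option, wget saves the download to a file, so the command substitution has nothing to capture. A minimal sketch, assuming the goal is to hold the response body in a variable, sends the body to stdout with -O - and reads the exit status straight afterwards. The helper name fetch_body is hypothetical.

```shell
# fetch_body URL
# Hypothetical helper: prints the response body on success, or returns
# wget's exit status on failure.
fetch_body() {
    local url=$1 body status
    # Declare first, assign second: "local body=$(...)" would make $? report
    # the status of "local" itself (0) instead of wget's.
    body=$(wget -q -O - "$url")
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "download failed with status $status" >&2
        return "$status"
    fi
    printf '%s\n' "$body"
}
```

A caller could then write page=$(fetch_body "$URL") || exit 1.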

answered Apr 26, 2010 at 22:32

Cascabel

$r is the text output of wget (which you’ve captured with backticks). To access the return code, use the $? variable.

answered Apr 26, 2010 at 22:25

nobody

$r is empty, and therefore your condition becomes if [ -ne 0 ] and it seems as if -ne is used as a unary operator. Try this instead:

wget -q www.someurl.com
if [ $? -ne 0 ]
  ...

EDIT As Andrew explained before me, backticks return standard output, while $? returns the exit code of the last operation.
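
The difference between the two channels can be seen without touching the network. In this small sketch, quiet_failure is a hypothetical stand-in for wget -q hitting a missing URL: it prints nothing and exits with a non-zero status.

```shell
# Hypothetical stand-in for "wget -q" against a missing URL.
quiet_failure() { return 8; }

r=$(quiet_failure)   # captures standard output, which is empty here
status=$?            # exit status of the command inside $(...)

# Because $r is empty, an unquoted [ $r -ne 0 ] collapses to [ -ne 0 ].
# The exit status is the value that should be compared.
printf 'captured=[%s] status=%s\n' "$r" "$status"   # prints: captured=[] status=8
```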

answered Apr 26, 2010 at 22:26

Bolo

you could just

wget ruffingthewitness.com && echo "WE GOT IT" || echo "Failure"

-(~)----------------------------------------------------------(07:30 Tue Apr 27)
risk@DockMaster [2024] --> wget ruffingthewitness.com && echo "WE GOT IT" || echo "Failure" 
--2010-04-27 07:30:56--  http://ruffingthewitness.com/
Resolving ruffingthewitness.com... 69.56.251.239
Connecting to ruffingthewitness.com|69.56.251.239|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `index.html.1'

    [ <=>                                                                                    ] 14,252      72.7K/s   in 0.2s    

2010-04-27 07:30:58 (72.7 KB/s) - `index.html.1' saved [14252]

WE GOT IT
-(~)-----------------------------------------------------------------------------------------------------------(07:30 Tue Apr 27)
risk@DockMaster [2025] --> wget ruffingthewitness.biz && echo "WE GOT IT" || echo "Failure"
--2010-04-27 07:31:05--  http://ruffingthewitness.biz/
Resolving ruffingthewitness.biz... failed: Name or service not known.
wget: unable to resolve host address `ruffingthewitness.biz'
zsh: exit 1     wget ruffingthewitness.biz
Failure
-(~)-----------------------------------------------------------------------------------------------------------(07:31 Tue Apr 27)
risk@DockMaster [2026] --> 
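
One caveat with the a && b || c form: the last command also runs when b fails, not only when a fails. With a plain echo in the middle this rarely matters, but it can once the success branch does real work. The sketch below uses two hypothetical stand-in functions to show the difference from if/else.

```shell
# Both helpers are hypothetical stand-ins used only for illustration.
fetch_ok()     { return 0; }   # pretend the download succeeded
post_process() { return 1; }   # pretend a follow-up step failed

# Chain form: "Failure" is printed although the download itself succeeded,
# because || reacts to the status of post_process.
chain_result=$(fetch_ok && post_process || echo "Failure")

# if/else form: the else branch depends only on fetch_ok.
if fetch_ok; then
    post_process
    branch_result="download ok"
else
    branch_result="Failure"
fi
```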

answered Apr 27, 2010 at 12:31

kSiR

Best way to capture the result from wget and also check the call status

wget -O filename URL
if [[ $? -ne 0 ]]; then
    echo "wget failed"
    exit 1; 
fi

This way you can check the status of wget as well as store the output data.

  1. If the call succeeds, use the stored output.

  2. Otherwise the script exits with the error "wget failed".
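
The status can also say why the call failed. As a sketch, the helper below maps the exit statuses listed in the GNU wget manual to short messages. The function name describe_wget_status is hypothetical, and older wget releases may report a generic 1 for most failures.

```shell
# describe_wget_status STATUS
# Hypothetical helper: translates a GNU wget exit status into a short message.
describe_wget_status() {
    case "$1" in
        0) echo "no problems occurred" ;;
        1) echo "generic error" ;;
        2) echo "parse error (command line or .wgetrc)" ;;
        3) echo "file I/O error" ;;
        4) echo "network failure" ;;
        5) echo "SSL verification failure" ;;
        6) echo "username/password authentication failure" ;;
        7) echo "protocol error" ;;
        8) echo "server issued an error response" ;;
        *) echo "unknown status $1" ;;
    esac
}
```

For example, after wget -O filename URL a script could run describe_wget_status "$?" >&2 before exiting.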

answered Aug 24, 2016 at 11:49

Anshul Sharma

I tried all the solutions without luck.

wget runs non-interactively, and in my case I could not catch its return code with $?.

One workaround is to use the --server-response option and look for the HTTP 200 status code.
Example:

wget --server-response -q -o wgetOut http://www.someurl.com
sleep 5
_wgetHttpCode=`cat wgetOut | gawk '/HTTP/{ print $2 }'`
if [ "$_wgetHttpCode" != "200" ]; then
    echo "[Error] `cat wgetOut`"
fi

Note: wget needs some time to finish its work, which is why I added sleep 5. This is not the best approach, but it worked well enough to test the solution.
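
Unless it is started with -b, wget returns only after the transfer has finished, so the status line can usually be read straight from its output without a log file or a pause. The sketch below assumes the header lines look like "  HTTP/1.1 200 OK", as printed by --server-response. The helper name last_http_status is hypothetical.

```shell
# last_http_status
# Hypothetical helper: reads wget's --server-response text on stdin and prints
# the status code of the last response seen (the final one after redirects).
last_http_status() {
    awk '$1 ~ /^HTTP\// { code = $2 } END { if (code != "") print code }'
}

# Possible use (wget writes the headers to stderr, hence 2>&1):
#   code=$(wget --server-response -q -O /dev/null "$url" 2>&1 | last_http_status)
#   [ "$code" = "200" ] || echo "[Error] HTTP status: $code" >&2
```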


answered Jan 21, 2015 at 18:22

kazzikazzi


wget normally stops when it gets an HTTP error, e.g. 404 or so. Is there an option to make wget download the page content regardless of the HTTP code?

asked Mar 6, 2011 at 8:46

lilydjwg

Parameter: --content-on-error, available from wget 1.14:

If this is set to on, wget will not skip the content when the server responds with a http status code that indicates error.
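
A minimal sketch of how the option might be used in a script, assuming wget 1.14 or newer. The body goes to stdout, while wget still reports the HTTP error through its exit status. The helper name fetch_even_on_error is hypothetical.

```shell
# fetch_even_on_error URL
# Hypothetical helper: prints the response body even for HTTP error responses
# and returns wget's exit status, so the caller can still tell an error page
# from a successful download.
fetch_even_on_error() {
    wget --content-on-error -q -O - "$1"
}

# Possible use:
#   body=$(fetch_even_on_error "http://example.org/missing")
#   status=$?
```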


answered Sep 21, 2013 at 15:59

Nowaker

You could try something like:

#!/bin/sh

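# Quote "$1": with an unquoted empty argument, [ -n ] always succeeds
[ -n "$1" ] || {
    echo "Usage: $0 [url to file to get]" >&2
    exit 1
}

# Test wget's exit status directly; [ $? ] is true for any status, including 0
wget "$1" || {
  echo "Could not download $1" | mail -s "Uh Oh" you@yourdomain.com
  echo "Aww snap ..." >&2
  exit 1
}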

# If we're here, it downloaded successfully, and will exit with a normal status

When making a script that will (likely) be called by other scripts, it is important to do the following:

  • Ensure argument sanity
  • Send e-mail, write to a log, or do something else so someone knows what went wrong

The >&2 simply redirects the output of error messages to stderr, which allows a calling script to do something like this:

foo-downloader >/dev/null 2>/some/log/file.txt

Since it is a short wrapper, there is no reason to forsake a bit of sanity :)

This also allows you to selectively direct the output of wget to /dev/null. You might actually want to see it when testing, especially if you get an e-mail saying it failed :)

I need to test my HTTP server's responses in various cases, even for failed authentication. In case of failed authentication, my server returns 401 Unauthorized and also a response body which simply contains Unauthorized (or maybe some other detailed message).

Using e.g. curl or httpie, I get the response body in case of a 401 response.

$ curl http://10.5.1.1/bla 
Unauthorized
$ curl http://10.5.1.1/bla --digest --user joe:wrong 
Unauthorized
$ http http://10.5.1.1/bla -b
Unauthorized
$ http http://10.5.1.1/bla -b --auth-type digest --auth joe:wrong
Unauthorized

But when trying this using wget, I got no output:

$ wget http://10.5.1.1/bla -q -O /dev/stdout
$ wget http://10.5.1.1/bla -q -O /dev/stdout --user joe --password wrong

wget returns with exit code 6 in this case, but I need to check the response message.

Here is a dump of the complete traffic, captured using httpie:

$ http http://10.5.1.1/bla --print hbHB
GET /bla HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: 10.5.1.1
User-Agent: HTTPie/0.9.2

HTTP/1.1 401 Unauthorized
Connection: keep-alive
Content-Length: 13
Content-Type: text/plain
Date: 2017-08-27 23:01:07
Server: Vistra-I St10 SW218 HW010
WWW-Authenticate: Digest realm="Vistra-I", qop="auth", nonce="a0c5d461f2b74b2b797b62f54200d125", opaque="0123456789abcdef0123456789abcdef"

Unauthorized

(Note that the Unauthorized message in the response body ends with a new line character, that’s why Content-Length: 13)

The same thing happens if the server responds with 403, 404 and so on.
Any idea how to get the response body using wget in this case?

Edit 2017-09-22

Found --content-on-error option in my wget 1.17.1 (see also wget manual).

This works in case of e.g. response code 404 but not for 401 nor 5xx codes.

For 401 see this bug.
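
Until wget returns the body for these codes, a test script can fall back on curl, which the examples above already show printing the body for a 401. The sketch below assumes curl is installed and uses its -o and -w '%{http_code}' options to collect the body and the status code separately. The helper name status_and_body is hypothetical.

```shell
# status_and_body URL BODY_FILE
# Hypothetical helper: writes the response body to BODY_FILE and prints the
# HTTP status code on stdout, whatever that code is.
status_and_body() {
    curl -s -o "$2" -w '%{http_code}' "$1"
}

# Possible use:
#   code=$(status_and_body "http://10.5.1.1/bla" /tmp/body.txt)
#   [ "$code" = "401" ] && cat /tmp/body.txt
```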
