pkg-download: check hashes for locally cached files
author Yann E. MORIN <yann.morin.1998@free.fr>
Thu, 11 Dec 2014 22:52:08 +0000 (23:52 +0100)
committer Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Thu, 11 Dec 2014 22:59:41 +0000 (23:59 +0100)
In some cases, upstream updates a release in-place, without renaming
it. When that package is updated in Buildroot, a new hash matching the
new upstream release is added to the corresponding .hash file.

As a consequence, users who previously downloaded that package's tarball
with an older version of Buildroot are stuck with the old archive. After
updating their Buildroot copy, they are greeted with a failed download,
because the local file does not match the new hashes.

Also, an upstream sometimes serves us HTML garbage instead of the
actual tarball we requested, as SourceForge does from time to time, for
as-yet unknown reasons.

So, to avoid this situation, check the hashes prior to doing the
download. If the hashes match, consider the locally cached file genuine,
and do not download it. However, if the locally cached file does not
match the known hashes we have for it, it is promptly removed, and a
download is re-attempted.
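
The decision described above can be sketched as a small standalone shell
helper. This is only an illustration, not the code from this patch: the
function names hash_matches and reuse_cached_file are hypothetical, a single
sha256 (computed with the coreutils sha256sum tool) stands in for the .hash
file lookup, and the real check is delegated to support/download/check-hash.

```shell
#!/bin/sh

# Hypothetical stand-in for support/download/check-hash:
# succeed if the sha256 of FILE equals EXPECTED.
# Assumes sha256sum (GNU coreutils) is available.
hash_matches() {
    expected="$1"
    file="$2"
    actual="$(sha256sum "${file}" | cut -d' ' -f1)"
    [ "${actual}" = "${expected}" ]
}

# Return 0 if the cached FILE is genuine and can be reused.
# Otherwise return 1 so the caller goes on with the download;
# a stale or corrupt cached file is removed first.
reuse_cached_file() {
    expected="$1"
    file="$2"

    # Nothing cached yet: a download is needed.
    [ -e "${file}" ] || return 1

    if hash_matches "${expected}" "${file}"; then
        return 0
    fi

    rm -f "${file}"
    printf "Re-downloading '%s'...\n" "${file##*/}"
    return 1
}
```

A caller would skip the download only when reuse_cached_file returns 0. In
the other two cases (no cached file, or a cached file that was just removed)
it falls through to the normal download path.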

Note: this does not add any overhead compared to the previous situation,
because we were already checking hashes of locally cached files. It just
changes the order in which we do the checks. For the record, here is the
overhead of hashing a 231MiB file (qt-everywhere-opensource-src-4.8.6.tar.gz)
on a core-i5 @2.5GHz:

            cache-cold  cache-hot
    sha1      1.914s      0.762s
    sha256    2.109s      1.270s

But again, this overhead already existed before this patch.

Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Peter Korsgaard <jacmet@uclibc.org>
Cc: Gustavo Zacarias <gustavo@zacarias.com.ar>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
support/download/dl-wrapper

index f0cdd735b4147eaa9089ec0af9f572c7d5baaf49..cced8f6a4c7cbe08317cc860cb1b4243c0413e33 100755 (executable)
@@ -49,7 +49,11 @@ main() {
 
     # If the output file already exists, do not download it again
     if [ -e "${output}" ]; then
-        exit 0
+        if support/download/check-hash "${hfile}" "${output}" "${output##*/}"; then
+            exit 0
+        fi
+        rm -f "${output}"
+        printf "Re-downloading '%s'...\n" "${output##*/}"
     fi
 
     # tmpd is a temporary directory in which backends may store intermediate