Uploading and Downloading Binary Files by Unconventional Means

Author: scz@NSFOCUS
Source: NSFOCUS blog

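Assume the only data channel is a weak shell that echoes output, over which printable characters can be transferred losslessly by copy and paste. To get non-printable bytes across, they have to be mapped through encoding and decoding. This article demonstrates three data-mapping schemes; other codecs exist, but these three are enough.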

od+xxd

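In 2000, tt and I wanted to pull an ELF off a remote host so that we could reverse-engineer it locally. The target exposed only a service on 23/TCP; every other port was unreachable. The host had pitifully few commands available: no xxd, no base64, no uuencode or the like. To our surprise it did have od, and in the end we used od to bring the ELF back to the local machine.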

For the demonstrations below, fabricate a binary file:

$ printf -v escseq \\%o {0..255}
$ printf "$escseq" > some

This is bash syntax; ash does not support it.

$ xxd -g 1 some
00000000: 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f ................
00000010: 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f ................
00000020: 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f !"#$%&'()*+,-./
00000030: 30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f 0123456789:;<=>?
00000040: 40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f @ABCDEFGHIJKLMNO
00000050: 50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f PQRSTUVWXYZ[\]^_
00000060: 60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f `abcdefghijklmno
00000070: 70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f pqrstuvwxyz{|}~.
00000080: 80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f ................
00000090: 90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f ................
000000a0: a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af ................
000000b0: b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf ................
000000c0: c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf ................
000000d0: d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc dd de df ................
000000e0: e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef ................
000000f0: f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc fd fe ff ................
$ xxd -p some
000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d
1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b
3c3d3e3f404142434445464748494a4b4c4d4e4f50515253545556575859
5a5b5c5d5e5f606162636465666768696a6b6c6d6e6f7071727374757677
78797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495
969798999a9b9c9d9e9fa0a1a2a3a4a5a6a7a8a9aaabacadaeafb0b1b2b3
b4b5b6b7b8b9babbbcbdbebfc0c1c2c3c4c5c6c7c8c9cacbcccdcecfd0d1
d2d3d4d5d6d7d8d9dadbdcdddedfe0e1e2e3e4e5e6e7e8e9eaebecedeeef
f0f1f2f3f4f5f6f7f8f9fafbfcfdfeff

xxd is common on Linux, but on other non-Linux *nix systems od is likely to be more common.

$ od -An -tx1 -v --width=30 some &> some.txt

some.txt looks like this:

00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d
1e 1f 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 35 36 37 38 39 3a 3b
3c 3d 3e 3f 40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f 50 51 52 53 54 55 56 57 58 59

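Display some.txt on the remote host and find a way to bring its contents back to the local machine unchanged, for example by recording the screen or enabling terminal logging. Then process some.txt locally to recover some.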

$ sed "s/ //g" some.txt &> some.tmp

If the remote host has sed, the step above can be done remotely, which reduces the amount of text sent over the network.

some.tmp looks like this:

000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d
1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b
3c3d3e3f404142434445464748494a4b4c4d4e4f50515253545556575859

The format of some.tmp is exactly the output format of "xxd -p".

$ xxd -r -p some.tmp some

od can only dump data; it cannot restore it. The binary is recovered above with "xxd -r".

Some people break into U-Boot with Ctrl-U, run a hexdump there, and then recover the binary. In essence it is the same thing.
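If xxd is missing on the receiving side as well, the restore step can be approximated in plain shell. The following is only a sketch, not part of the original workflow: hex_to_bin is a hypothetical helper name, and it assumes tr, fold and a printf that accepts 0x-prefixed numbers are available.

```shell
# hex_to_bin: hypothetical helper. Reads hex text on stdin, writes raw bytes
# to stdout. Anything that is not a hex digit (spaces, newlines) is dropped.
hex_to_bin() {
    tr -cd '0-9a-fA-F' | fold -w 2 | while read -r hh || [ -n "$hh" ]; do
        # ignore a dangling half byte at the very end of the input
        [ "${#hh}" -eq 2 ] || continue
        # hex pair -> 3-digit octal -> printf octal escape -> one raw byte
        printf "\\$(printf '%03o' "0x$hh")"
    done
}

# usage: od -An -tx1 -v some | hex_to_bin > some.restored
```

It forks one printf per byte, so it is slow; it is meant for small files and for checking a transfer, not as a replacement for "xxd -r".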

xxd

If the remote host has xxd, the whole process is similar.

1) Scheme 1

$ xxd -p some &> some.txt

some.txt looks like this:

000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d
1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b
3c3d3e3f404142434445464748494a4b4c4d4e4f50515253545556575859

The some.txt produced by xxd is already in its most compact form, so no further processing with sed is needed.

$ xxd -r -p some.txt some

2) Scheme 2

Scheme 2 demonstrates other xxd options; it is less cost-effective than scheme 1.

$ xxd -g 1 some some.txt

some.txt looks like this:

00000000: 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f ................
00000010: 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f ................
00000020: 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f !"#$%&'()*+,-./

$ xxd -r -s 0 some.txt some

base64

https://en.wikipedia.org/wiki/Base64

Original data

01 02 03

Binary representation

00000001 00000010 00000011

Regrouped from 8 bits per group to 6 bits per group

000000 010000 001000 000011

Hexadecimal representation

00 10 08 03

After table lookup this becomes:

A Q I D

The above is the basic principle of base64 encoding; it does not cover the case where padding is needed.
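The regrouping above can be checked with a few lines of shell arithmetic. This is only a sketch for one full 3-byte group without padding; b64_triplet is a hypothetical name, and cut is assumed to be available.

```shell
# b64_triplet: hypothetical helper. Takes three byte values (decimal) and
# prints the four base64 characters for that group.
b64_triplet() {
    table="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    # pack the three bytes into one 24-bit number
    n=$(( ($1 << 16) | ($2 << 8) | $3 ))
    out=""
    for s in 18 12 6 0; do
        # take 6 bits at a time; cut -c is 1-based, hence the +1
        idx=$(( ((n >> s) & 63) + 1 ))
        out="$out$(printf '%s' "$table" | cut -c "$idx")"
    done
    printf '%s\n' "$out"
}

# usage: b64_triplet 1 2 3
```

Called with 1 2 3 it prints AQID, matching the table lookup shown above.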

If the remote host can base64-encode the binary, there is not much more to say.

$ base64 some > some.txt

some.txt looks like this:

AAECAwQFBgcICQoLDA0ODxAREhMUFRYXGBkaGxwdHh8gISIjJCUmJygpKissLS4vMDEyMzQ1Njc4
OTo7PD0+P0BBQkNERUZHSElKS0xNTk9QUVJTVFVWV1hZWltcXV5fYGFiY2RlZmdoaWprbG1ub3Bx
cnN0dXZ3eHl6e3x9fn+AgYKDhIWGh4iJiouMjY6PkJGSk5SVlpeYmZqbnJ2en6ChoqOkpaanqKmq
q6ytrq+wsbKztLW2t7i5uru8vb6/wMHCw8TFxsfIycrLzM3Oz9DR0tPU1dbX2Nna29zd3t/g4eLj
5OXm5+jp6uvs7e7v8PHy8/T19vf4+fr7/P3+/w==

$ base64 -d some.txt > some

This article assumes a *nix environment and does not consider things like vbscript or jscript.

base64 is more space-efficient than "xxd -p": each character carries 6 bits in the former and 4 bits in the latter.
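The difference is easy to quantify. The sketch below (encoded_sizes is a hypothetical name) counts payload characters only and ignores the line breaks both tools insert.

```shell
# encoded_sizes: hypothetical helper. Given a byte count, print how many
# characters a plain hex dump and a padded base64 encoding need.
encoded_sizes() {
    n=$1
    hex=$(( n * 2 ))               # 4 bits per character
    b64=$(( (n + 2) / 3 * 4 ))     # 4 characters per 3-byte group, padded
    printf 'hex=%d base64=%d\n' "$hex" "$b64"
}

# usage: encoded_sizes 256
```

For the 256-byte sample file this gives 512 characters of hex against 344 characters of base64.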

uuencode/uudecode

https://en.wikipedia.org/wiki/Uuencoding

begin <mode> <file>
<length character><formatted characters>
<length character><formatted characters>
...
`
end

<length character> is a character indicating the number of data bytes which have been encoded on that line. This is an ASCII character determined by adding 32 to the actual byte count, with the sole exception of a grave accent "`" (ASCII code 96) signifying zero bytes. All data lines except the last (if the data was not divisible by 45), have 45 bytes of encoded data (60 characters after encoding). Therefore, the vast majority of length values is 'M', (32 + 45 = ASCII code 77 or "M").

If the source is not divisible by 3 then the last 4-byte section will contain padding bytes to make it cleanly divisible. These bytes are subtracted from the line's <length character> so that the decoder does not append unwanted characters to the file.
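The length-character rule quoted above can be sketched in shell. uu_len_char is a hypothetical helper name; it only illustrates the rule and is not part of any uuencode implementation.

```shell
# uu_len_char: hypothetical helper. Given the number of data bytes on a
# uuencoded line (0..45), print that line's length character.
uu_len_char() {
    n=$1
    if [ "$n" -eq 0 ]; then
        # zero bytes is written as a grave accent, not as a space
        printf '`\n'
    else
        # add 32, then emit the character through an octal escape
        printf "\\$(printf '%03o' $(( n + 32 )))\n"
    fi
}

# usage: uu_len_char 45
```

A full line of 45 bytes yields M, and a final line carrying 31 bytes yields ?.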

uuencoding is rarely seen these days.

1) uuencoding

$ uuencode some some > some.txt

some.txt looks like this:

begin 644 some
M``$"`P0%!@<("0H+#`T.#Q`1$A,4%187&!D:&QP='A\@(2(C)"4F)R@I*BLL
M+2XO,#$R,S0U-C<X.3H[/#T^/T!!0D-$149'2$E*2TQ-3D]045)35%565UA9
M6EM<75Y?8&%B8V1E9F=H:6IK;&UN;W!Q<G-T=79W>'EZ>WQ]?G^`@8*#A(6&
MAXB)BHN,C8Z/D)&2DY25EI>8F9J;G)V>GZ"AHJ.DI::GJ*FJJZRMKJ^PL;*S
MM+6VM[BYNKN\O;Z_P,'"P\3%QL?(R<K+S,W.S]#1TM/4U=;7V-G:V]S=WM_@
?X>+CY.7FY^CIZNOL[>[O\/'R\_3U]O?X^?K[_/W^_P``
`
end

This is traditional uuencode encoding.

$ uudecode -o some some.txt

2) base64 encoding

Some uuencode implementations support base64:

$ uuencode -m some some > some.txt

some.txt looks like this:

begin-base64 644 some
AAECAwQFBgcICQoLDA0ODxAREhMUFRYXGBkaGxwdHh8gISIjJCUmJygpKiss
LS4vMDEyMzQ1Njc4OTo7PD0+P0BBQkNERUZHSElKS0xNTk9QUVJTVFVWV1hZ
WltcXV5fYGFiY2RlZmdoaWprbG1ub3BxcnN0dXZ3eHl6e3x9fn+AgYKDhIWG
h4iJiouMjY6PkJGSk5SVlpeYmZqbnJ2en6ChoqOkpaanqKmqq6ytrq+wsbKz
tLW2t7i5uru8vb6/wMHCw8TFxsfIycrLzM3Oz9DR0tPU1dbX2Nna29zd3t/g
4eLj5OXm5+jp6uvs7e7v8PHy8/T19vf4+fr7/P3+/w==

====

$ uudecode -o some some.txt

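No extra option is needed when decoding; the base64 variant is recognized from the first line. Compared with the output of base64, the output of "uuencode -m" has an extra first line and an extra last line: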

begin-base64 644 some

====

Once these two lines are deleted, the rest can be decoded with "base64 -d".
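Deleting the two wrapper lines is a one-liner if sed is at hand. The function name strip_uu_b64_wrapper below is hypothetical; this is a sketch, not something the original workflow used.

```shell
# strip_uu_b64_wrapper: hypothetical helper. Removes the "begin-base64" header
# and the "====" trailer so that only the base64 payload remains.
strip_uu_b64_wrapper() {
    sed -e '/^begin-base64 /d' -e '/^====$/d'
}

# usage: strip_uu_b64_wrapper < some.txt | base64 -d > some
```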

awk

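We are not only concerned with downloading a binary from the remote host; uploading a binary to it matters too.

If the target has gcc, just write the base64 encoder and decoder in C. This article does not consider such permissive environments: perl, python, gcc and the like are out of scope. What it does consider is a target that has awk.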

1) base64decode.awk

https://github.com/shane-kerr/AWK-base64decode/blob/master/base64decode.awk

# base64decode.awk
#
# Introduction
# ============
# Decode Base64-encoded strings.
#
# Invocation
# ==========
# Typically you run the script like this:
#
#     $ awk -f base64decode.awk [file1 [file2 [...]]] > output

# The script implements Base64 decoding, based on RFC 3548:
#
# https://tools.ietf.org/html/rfc3548

# create our lookup table
BEGIN {
    # load symbols based on the alphabet
    for (i=0; i<26; i++) {
        BASE64[sprintf("%c", i+65)] = i
        BASE64[sprintf("%c", i+97)] = i+26
    }
    # load our numbers
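    for (i=0; i<10; i++) {
        BASE64[sprintf("%c", i+48)] = i+52
    }
    # and finally our two additional characters
    BASE64["+"] = 62
    BASE64["/"] = 63
    # also add in our padding character
    BASE64["="] = -1
}

# The main function to decode Base64 data.
#
# Arguments:
# * encoded - the Base64 string
# * result - an array to return the binary data in
function base64decode(encoded, result) {
    n = 1
    while (length(encoded) >= 4) {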
        g0 = BASE64[substr(encoded, 1, 1)]
        g1 = BASE64[substr(encoded, 2, 1)]
        g2 = BASE64[substr(encoded, 3, 1)]
        g3 = BASE64[substr(encoded, 4, 1)]
        if (g0 == "") {
            printf("Unrecognized character %c in Base 64 encoded string\n",
                   g0) >> "/dev/stderr"
            exit 1
        }
        if (g1 == "") {
            printf("Unrecognized character %c in Base 64 encoded string\n",
                   g1) >> "/dev/stderr"
            exit 1
        }
        if (g2 == "") {
            printf("Unrecognized character %c in Base 64 encoded string\n",
                   g2) >> "/dev/stderr"
            exit 1
        }
        if (g3 == "") {
            printf("Unrecognized character %c in Base 64 encoded string\n",
                   g3) >> "/dev/stderr"
            exit 1
        }

        # we don't have bit shifting in AWK, but we can achieve the same
        # results with multiplication, division, and modulo arithmetic
        result[n++] = (g0 * 4) + int(g1 / 16)
        if (g2 != -1) {
            result[n++] = ((g1 * 16) % 256) + int(g2 / 4)
            if (g3 != -1) {
                result[n++] = ((g2 * 64) % 256) + g3
            }
        }

        encoded = substr(encoded, 5)
    }
    if (length(encoded) != 0) {
        printf("Extra characters at end of Base 64 encoded string: \"%s\"\n",
               encoded) >> "/dev/stderr"
        exit 1
    }
}

# our main text processing
{
    # Decode what we have read.
    base64decode($0, result)

    # Output the decoded string.
    #
    # We cannot output a NUL character using BusyBox AWK. See:
    # https://stackoverflow.com/a/32302711
    #
    # So we collect our result into an octal string and use the
    # shell "printf" command to create the actual output.
    #
    # This also helps with gawk, which gets confused about the
    # non-ASCII output if localization is used unless this is
    # set via LC_ALL=C or via "--characters-as-bytes".
    printf_str = ""
    for (i=1; i in result; i++) {
        printf_str = printf_str sprintf("\\%03o", result[i])
        if (length(printf_str) >= 1024) {
            system("printf '" printf_str "'")
            printf_str = ""
        }
        delete result[i]
    }
    system("printf '" printf_str "'")
}

$ base64 some > some.txt

$ awk -f base64decode.awk some.txt > some
$ busybox awk -f base64decode.awk some.txt > some

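busybox does not necessarily include nc; if it has awk, the trick above can be used. awk scripts run very slowly, but in extreme cases this is better than nothing. base64decode.awk decoded successfully in a very weak busybox environment.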
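The comment in base64decode.awk about replacing bit shifts with multiplication, division and modulo can be tried in isolation. The sketch below redoes the same three formulas in shell arithmetic; sextets_to_bytes is a hypothetical name and padding is not handled.

```shell
# sextets_to_bytes: hypothetical helper. Takes four 6-bit values (decimal)
# and prints the three bytes they encode, using no shift operators.
sextets_to_bytes() {
    g0=$1; g1=$2; g2=$3; g3=$4
    printf '%d %d %d\n' \
        $(( g0 * 4 + g1 / 16 )) \
        $(( (g1 * 16) % 256 + g2 / 4 )) \
        $(( (g2 * 64) % 256 + g3 ))
}

# usage: sextets_to_bytes 0 16 8 3
```

With the sextets 0 16 8 3 (the characters A Q I D) it prints 1 2 3.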

2) base64.awk

https://sites.google.com/site/dannychouinard/Home/unix-linux-trinkets/little-utilities/base64-and-base85-encoding-awk-scripts

Danny Chouinard's original implementation does not handle the trailing = correctly when base64-encoding; it always appends "==".

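This is not a serious problem. Output produced by the original script can be decoded by the original script, but other tools may not be tolerant enough to decode it. For example, the original script encodes "scz@nsfocus" as "c2N6QG5zZm9jdXM==", with one = too many at the end.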
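How many = characters a conforming encoder should emit depends only on the input length modulo 3. A minimal sketch, with b64_pad_count as a hypothetical name:

```shell
# b64_pad_count: hypothetical helper. Given the input length in bytes, print
# the number of "=" padding characters in its base64 encoding.
b64_pad_count() {
    printf '%d\n' $(( (3 - $1 % 3) % 3 ))
}

# usage: b64_pad_count 11
```

"scz@nsfocus" is 11 bytes long, so exactly one = is expected; the 256-byte sample file needs two.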

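If only Danny Chouinard's original script is used at both ends, his implementation is the most compact.

Below is a modified version that makes the base64 output conform to the specification so it can be mixed with other tools. Its base64 encoder cannot process a binary directly; it only accepts input in the style of "xxd -p", in which spaces are allowed.

So far I have not found a way to process a binary directly with awk.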

#!/usr/bin/awk -f

#
# Author : Danny Chouinard
# Modify : scz@nsfocus
#

function base64encode ()
{
    o       = 0;
    bits    = 0;
    n       = 0;
    count   = 0;
    while ( getline )
    {
        for ( c = 0; c < length( $0 ); c++ )
        {
            h   = index( "0123456789abcdef", substr( $0, c+1, 1 ) );
            if ( h-- )
            {
                count++;
                for( i = 0; i < 4; i++ )
                {
                    o   = o * 2 + int( h / 8 )
                    h   = ( h * 2 ) % 16;
                    if( ++bits == 6 )
                    {
                        printf substr( base64table, o+1, 1 );
                        if( ++n >= maxn )
                        {
                            printf( "\n" );
                            n   = 0;
                        }
                        o       = 0;
                        bits    = 0;
                    }
                }
            }
        }
    }
    if ( bits )
    {
        while ( bits++ < 6 )
        {
            o   = o * 2;
        }
        printf substr( base64table, o+1, 1 );
        if( ++n >= maxn )
        {
            printf( "\n" );
            n   = 0;
        }
    }
    count   = int( count / 2 ) % 3;
    if ( count )
    {
        for ( i = 0; i < 3 - count; i++ )
        {
            printf( "=" );
            if( ++n >= maxn )
            {
                printf( "\n" );
                n   = 0;
            }
        }
    }
    if ( n )
    {
        printf( "\n" );
    }
}

function base64decode ()
{
    o       = 0;
    bits    = 0;
    while( getline < "/dev/stdin" )
    {
        for ( i = 0; i < length( $0 ); i++ )
        {
            c   = index( base64table, substr( $0, i+1, 1 ) );
            if ( c-- )
            {
                for ( b = 0; b < 6; b++ )
                {
                    o   = o * 2 + int( c / 32 );
                    c   = ( c * 2 ) % 64;
                    if( ++bits == 8 )
                    {
                        printf( "%c", o );
                        o       = 0;
                        bits    = 0;
                    }
                }
            }
        }
    }
}

BEGIN   \
{
    base64table = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    maxn        = 76;

    if ( ARGV[1] == "d" )
    {
        base64decode();
    }
    else
    {
        base64encode();
    }
}

$ xxd -p some | awk -f base64.awk > some.txt
$ base64 some > some.txt

The two outputs are identical.

When decoding base64, the UTF-8 handling of %c must be turned off. Either of the following works:

$ LANG=C awk -f base64.awk d < some.txt > some
$ awk --characters-as-bytes -f base64.awk d < some.txt > some

base64.awk uses awk's printf() directly. If the awk is actually provided by busybox, it may be unable to output 0x00; this has to be tested on the target.
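One possible way to run that test is sketched below. awk_emits_nul is a hypothetical helper; it assumes od and tr exist on the target, and the awk command to probe is passed as an optional argument.

```shell
# awk_emits_nul: hypothetical probe. Succeeds (exit status 0) if the given
# awk command can write a single 0x00 byte with printf("%c", 0).
awk_emits_nul() {
    awk_cmd=${1:-awk}
    # left unquoted on purpose so that "busybox awk" splits into two words
    dump=$($awk_cmd 'BEGIN { printf("%c", 0) }' | od -An -tx1 | tr -d ' \n')
    [ "$dump" = "00" ]
}

# usage: awk_emits_nul "busybox awk" && echo "NUL ok" || echo "NUL lost"
```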

Calling the shell's printf instead makes it possible to output 0x00 and sidesteps the UTF-8 trouble as well; see uudecode_ash.awk.

3) uudecode.awk

The awk provided by busybox may be unable to output 0x00, so this script is not suitable for busybox environments.

#!/usr/bin/awk -f

function looktable ( l, p )
{
    uutable = "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_";
    return index( uutable, substr( l, p+1, 1 ) );
}

/^[^be]/    \
{
    len = looktable( $0, 0 );
    for ( i = 1; len > 0; i += 4 )
    {
        a       = looktable( $0, i );
        b       = looktable( $0, i+1 );
        c       = looktable( $0, i+2 );
        d       = looktable( $0, i+3 );
        printf( "%c", a * 4 + b / 16 );
        if ( len > 1 )
        {
            printf( "%c", b * 16 + c / 4 );
            if ( len > 2 )
            {
                printf( "%c", c * 64 + d );
            }
        }
        len    -= 3;
    }
}

$ uuencode some some > some.txt

$ LANG=C awk -f uudecode.awk < some.txt > some
$ awk --characters-as-bytes -f uudecode.awk < some.txt > some

4) uudecode_ash.awk

#!/bin/awk -f

function looktable ( l, p )
{
    uutable = "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_";
    return index( uutable, substr( l, p+1, 1 ) );
}

/^[^be]/    \
{
    len     = looktable( $0, 0 );
    n       = 1;
    for ( i = 1; len > 0; i += 4 )
    {
        a           = looktable( $0, i );
        b           = looktable( $0, i+1 );
        c           = looktable( $0, i+2 );
        d           = looktable( $0, i+3 );
        ret[n++]    = ( a * 4 + b / 16 ) % 256;
        if ( len > 1 )
        {
            ret[n++]    = ( b * 16 + c / 4 ) % 256;
            if ( len > 2 )
            {
                ret[n++]    = ( c * 64 + d ) % 256;
            }
        }
        len    -= 3;
    }
    escseq  = "";
    for ( i = 1; i in ret; i++ )
    {
        escseq = escseq sprintf( "\\x%02x", ret[i] );
        delete ret[i];
    }
    system( "printf \"" escseq "\"" );
}

$ uuencode some some > some.txt
$ busybox awk -f uudecode_ash.awk < some.txt > some

Here LANG=C is not needed and 0x00 can be output, so this works in busybox environments.

5) base64_ash.awk

This is a version ported from base64.awk that can be used in busybox (ash+awk).

#!/bin/awk -f

#
# Author : Danny Chouinard
# Modify : scz@nsfocus
#

function base64encode ()
{
    o       = 0;
    bits    = 0;
    n       = 0;
    count   = 0;
    while ( getline )
    {
        for ( c = 0; c < length( $0 ); c++ )
        {
            h   = index( "0123456789abcdef", substr( $0, c+1, 1 ) );
            if ( h-- )
            {
                count++;
                for( i = 0; i < 4; i++ )
                {
                    o   = o * 2 + int( h / 8 )
                    h   = ( h * 2 ) % 16;
                    if( ++bits == 6 )
                    {
                        printf substr( base64table, o+1, 1 );
                        if( ++n >= maxn )
                        {
                            printf( "\n" );
                            n   = 0;
                        }
                        o       = 0;
                        bits    = 0;
                    }
                }
            }
        }
    }
    if ( bits )
    {
        while ( bits++ < 6 )
        {
            o   = o * 2;
        }
        printf substr( base64table, o+1, 1 );
        if( ++n >= maxn )
        {
            printf( "\n" );
            n   = 0;
        }
    }
    count   = int( count / 2 ) % 3;
    if ( count )
    {
        for ( i = 0; i < 3 - count; i++ )
        {
            printf( "=" );
            if( ++n >= maxn )
            {
                printf( "\n" );
                n   = 0;
            }
        }
    }
    if ( n )
    {
        printf( "\n" );
    }
}

function base64decode ()
{
    o       = 0;
    bits    = 0;
    while( getline < "/dev/stdin" )
    {
        n       = 1;
        for ( i = 0; i < length( $0 ); i++ )
        {
            c   = index( base64table, substr( $0, i+1, 1 ) );
            if ( c-- )
            {
                for ( b = 0; b < 6; b++ )
                {
                    o   = o * 2 + int( c / 32 );
                    c   = ( c * 2 ) % 64;
                    if( ++bits == 8 )
                    {
                        ret[n++]    = o;
                        o           = 0;
                        bits        = 0;
                    }
                }
            }
        }
        escseq  = "";
        for ( i = 1; i in ret; i++ )
        {
            escseq = escseq sprintf( "\\x%02x", ret[i] );
            delete ret[i];
        }
        system( "printf \"" escseq "\"" );
    }
}

BEGIN   \
{
    base64table = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    maxn        = 76;

    if ( ARGV[1] == "d" )
    {
        base64decode();
    }
    else
    {
        base64encode();
    }
}

$ busybox od -An -tx1 -v some | busybox awk -f base64_ash.awk > some.txt
$ busybox awk -f base64_ash.awk d < some.txt > some

Here LANG=C is not needed and 0x00 can be output, so this works in busybox environments.

bash

1) xxd.sh

This script requires bash 4.3 or later and is full of bash tricks. If you cannot follow it, see bash(1).

#!/bin/bash

#
# Read a file by bytes in BASH
# https://stackoverflow.com/questions/13889659/read-a-file-by-bytes-in-bash
#
# Author : F. Hauri
#        : 2016-09
#
# Modify : scz@nsfocus
#        : 2018-10-08
#        : 2018-10-11 15:12
#

function hexdump ()
{
    printf -v escseq \\%o {32..126}
    printf -v asciitable "$escseq"
    printf -v ctrltable %-20sE abtnvfr

    if [ "$1" == "-p" ] ; then
        printf -v spaceline %30s
        fmt=${spaceline// /%02x}
    else
        printf -v spaceline %16s
        fmt=${spaceline// / %02x}
    fi

    offset=0
    hexarray=()
    asciidump=

    while LANG=C IFS= read -r -d '' -n 1 byte
    do
        if [ "$byte" ] ; then
            printf -v escchar "%q" "$byte"
            case ${#escchar} in
            1|2 )
                index=${asciitable%$escchar*}
                hexarray+=($((${#index}+0x20)))
                asciidump+=$byte
                ;;
            5 )
                tmp=${escchar#*\'\\}
                index=${ctrltable%${tmp%\'}*}
                hexarray+=($((${#index}+7)))
                asciidump+=.
                ;;
            7 )
                tmp=${escchar#*\'\\}
                hexarray+=($((8#${tmp%\'})))
                asciidump+=.
                ;;
            * )
                echo >&2 Error: "[$escchar]"
                ;;
            esac
        else
            hexarray+=(0)
            asciidump+=.
        fi
        if [ "$1" == "-p" ] ; then
            if [ ${#hexarray[@]} -gt 29 ] ; then
                printf "$fmt\n" ${hexarray[@]}
                ((offset+=30))
                hexarray=()
                asciidump=
            fi
        else
            if [ ${#hexarray[@]} -gt 15 ] ; then
                printf "%08x:$fmt  %s\n" $offset ${hexarray[@]} "$asciidump"
                ((offset+=16))
                hexarray=()
                asciidump=
            fi
        fi
    done

    if [ "$hexarray" ] ; then
        if [ "$1" == "-p" ] ; then
            fmt="${fmt:0:${#hexarray[@]}*4}"
            printf "$fmt\n" ${hexarray[@]}
        else
            fmt="${fmt:0:${#hexarray[@]}*5}%$((48-${#hexarray[@]}*3))s"
            printf "%08x:$fmt  %s\n" $offset ${hexarray[@]} " " "$asciidump"
        fi
    fi
}

function revert ()
{
    hextable="0123456789abcdef"
    two=0
    hh=

    while LANG=C IFS= read -r -d '' -n 1 byte
    do
        if [ "$byte" ] ; then
            printf -v escchar "%q" "$byte"
            case ${#escchar} in
            1 )
                index=${hextable%${escchar,,}*}
                index=${#index}
                if [[ $index != 16 ]] ; then
                    ((two+=1))
                    hh+=$escchar
                    if [[ $two == 2 ]] ; then
                        printf -v escseq "\\\\x%s" $hh
                        printf $escseq
                        two=0
                        hh=
                    fi
                fi
                ;;
            * )
                ;;
            esac
        fi
    done
}

if [ "$1" != "-r" ] ; then
    hexdump $1
else
    revert
fi

The -d '' in the script is important; without it, a \n that is read gets converted to \0.

$ ./xxd.sh -p < some > some.txt
$ xxd -p some > some.txt

The two outputs are identical.

$ ./xxd.sh -r < some.txt > some
$ xxd -r -p some.txt some

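The two outputs are identical.

The input of "xxd.sh -r" may contain spaces, newlines and any other characters that are not hexadecimal digits; they are discarded.

Hexadecimal digits are case-insensitive.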

2) xxd_mini.sh

#!/bin/bash

#
# Author : scz@nsfocus
#        : 2018-10-08
#        : 2018-10-12 11:58
#

hexdump ()
{
    count=0

    while LANG=C IFS= read -r -d '' -n 1 byte
    do
        LANG=C printf '%02x' "'$byte"
        let count+=1
        if [ $count -eq 30 ] ; then
            printf "\n"
            count=0
        fi
    done
    if [ $count -ne 0 ] ; then
        printf "\n"
    fi
}

revert ()
{
    hextable="0123456789abcdef"
    two=0
    hh=

    while LANG=C IFS= read -r -n 1 byte
    do
        if [ "$byte" ] ; then
            index=${hextable%${byte}*}
            index=${#index}
            if [[ $index != 16 ]] ; then
                let two+=1
                hh=$hh$byte
                if [[ $two == 2 ]] ; then
                    printf "\x"$hh
                    two=0
                    hh=
                fi
            fi
        fi
    done
}

if [ "$1" != "-r" ] ; then
    hexdump $1
else
    revert
fi

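This script does not support a hexdump with an ASCII column, that is, the effect of "xxd -g 1", but it does support the effects of "xxd -p" and "xxd -r". As an upload and download tool that is enough.

Compared with xxd.sh, the syntax of xxd_mini.sh is somewhat dated. This is for ash compatibility; see the notes on xxd.ash.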

$ ./xxd_mini.sh < some
$ xxd -p some

The two outputs are identical.

$ ./xxd_mini.sh < some | ./xxd_mini.sh -r | xxd -g 1
$ xxd -p some | xxd -r -p | xxd -g 1

The two outputs are identical.

3) base64.sh

#!/bin/bash

#
# Author : scz@nsfocus
#        : 2018-10-08
#        : 2018-10-16 17:46
#

function base64encode ()
{
    printf -v escseq \\%o {32..126}
    printf -v asciitable "$escseq"
    printf -v ctrltable %-20sE abtnvfr

    base64table="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    o=0
    bits=0
    n=0
    maxn=76
    count=0

    while LANG=C IFS= read -r -d '' -n 1 byte
    do
        if [ "$byte" ] ; then
            printf -v escchar "%q" "$byte"
            case ${#escchar} in
            1|2 )
                index=${asciitable%$escchar*}
                ((hh=${#index}+0x20))
                ;;
            5 )
                tmp=${escchar#*\'\\}
                index=${ctrltable%${tmp%\'}*}
                ((hh=${#index}+7))
                ;;
            7 )
                tmp=${escchar#*\'\\}
                ((hh=8#${tmp%\'}))
                ;;
            * )
                echo >&2 Error: "[$escchar]"
                ;;
            esac
        else
            hh=0
        fi
        ((count++))
        for ((i=0;i<8;i++))
        do
            ((o=o*2+hh/128))
            ((hh=hh*2%256))
            ((bits++))
            if [[ $bits == 6 ]] ; then
                printf ${base64table:$o:1}
                ((n++))
                if [ $n -ge $maxn ] ; then
                    printf "\n"
                    n=0
                fi
                o=0
                bits=0
            fi
        done
    done
    if [[ $bits != 0 ]] ; then
        while [ $bits -lt 6 ]
        do
            ((bits++))
            ((o*=2))
        done
        printf ${base64table:$o:1}
        ((n++))
        if [ $n -ge $maxn ] ; then
            printf "\n"
            n=0
        fi
    fi
    ((count=count%3))
    if [[ $count != 0 ]] ; then
        for ((i=0;i<3-count;i++))
        do
            printf "="
            ((n++))
            if [ $n -ge $maxn ] ; then
                printf "\n"
                n=0
            fi
        done
    fi
    if [ $n -ne 0 ] ; then
        printf "\n"
    fi
}

function base64decode ()
{
    base64table="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    o=0
    bits=0

    while LANG=C IFS= read -r -d '' -n 1 byte
    do
        if [ "$byte" ] ; then
            c=${base64table%${byte}*}
            c=${#c}
            if [[ $c != 64 ]] ; then
                #
                # printf "%#x\n" $c
                # continue
                #
                for ((b=0;b<6;b++))
                do
                    ((o=o*2+c/32))
                    ((c=c*2%64))
                    ((bits++))
                    if [[ $bits == 8 ]] ; then
                        printf -v escseq \\x5cx%x $o
                        printf $escseq
                        o=0
                        bits=0
                    fi
                done
            fi
        fi
    done
}

if [ "$1" != "-d" ] ; then
    base64encode
else
    base64decode
fi

$ echo -n -e "scz@nsfocus" | ./base64.sh
c2N6QG5zZm9jdXM=

$ ./base64.sh < some > some.txt
$ ./base64.sh -d < some.txt > some

ash

bash is powerful, but what we face is quite likely the ash provided by busybox, which is considerably weaker than bash.

1) xxd.ash

#!/bin/ash

#
# Author : scz@nsfocus
#        : 2018-10-08
#        : 2018-10-12 11:58
#

hexdump ()
{
    count=0

    while LANG=C IFS= read -r -n 1 byte
    do
        LANG=C printf '%02x' "'$byte"
        let count+=1
        if [ $count -eq 30 ] ; then
            printf "\n"
            count=0
        fi
    done
    if [ $count -ne 0 ] ; then
        printf "\n"
    fi
}

revert ()
{
    hextable="0123456789abcdef"
    two=0
    hh=

    while LANG=C IFS= read -r -n 1 byte
    do
        if [ "$byte" ] ; then
            index=${hextable%${byte}*}
            index=${#index}
            if [[ $index != 16 ]] ; then
                let two+=1
                hh=$hh$byte
                if [[ $two == 2 ]] ; then
                    printf "\x"$hh
                    two=0
                    hh=
                fi
            fi
        fi
    done
}

if [ "$1" != "-r" ] ; then
    hexdump $1
else
    revert
fi

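xxd.ash is in fact xxd_mini.sh; compatibility between ash and bash was fully considered when the latter was written.

The let keyword is used for increments because ash may well not support (()).

Do not write the function keyword: busybox v1.19.3 does not recognize it, whereas v1.27.2 does.

ash does not support -d or -N, so the -d '' was removed from read in xxd.ash. As a result the script cannot read \n correctly: it is converted to \0 on input, and I found no workaround in ash.

ash does not support ${parameter,,pattern} and so cannot convert input to lowercase automatically; xxd.ash can only handle an all-lowercase some.txt.

The revert() of xxd.ash is usable, but hexdump() cannot dump \n correctly. If some contains no \n, the hexdump() of xxd.ash can be used. If a hexdump() is absolutely required in a weak environment, revert() can first be used to upload a statically linked ELF; this is not discussed further here.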

2) echohelper.c

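busybox's ash supports "echo -n -e", which is probably the dumbest way to upload a binary. Write a helper C program that converts a given binary into a series of echo commands.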

/*
 * gcc -Wall -pipe -O3 -s -o echohelper echohelper.c
 */
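#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main ( int argc, char * argv[] )
{
    int             ret = EXIT_FAILURE;
    int             fd, i, n;
    unsigned char   buf[16];

    fd  = open( argv[1], O_RDONLY, 0 );
    if ( fd < 0 )
    {
        perror( "open" );
        return( ret );
    }
    /* emit one echo command per 16 input bytes */
    while ( ( n = read( fd, buf, sizeof( buf ) ) ) > 0 )
    {
        printf( "echo -n -e \"" );
        for ( i = 0; i < n; i++ )
        {
            printf( "\\x%x", buf[i] );
        }
        printf( "\"\n" );
    }
    if ( n == 0 )
    {
        ret = EXIT_SUCCESS;
    }
    close( fd );
    return( ret );
}

$ ./echohelper some > some.ash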
$ busybox ash some.ash > some

some.ash looks like this:

echo -n -e "\x0\x1\x2\x3\x4\x5\x6\x7\x8\x9\xa\xb\xc\xd\xe\xf"
echo -n -e "\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f"
echo -n -e "\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f"
echo -n -e "\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f"
echo -n -e "\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f"
echo -n -e "\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f"
echo -n -e "\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f"
echo -n -e "\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f"
echo -n -e "\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f"
echo -n -e "\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f"
echo -n -e "\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf"
echo -n -e "\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf"
echo -n -e "\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf"
echo -n -e "\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf"
echo -n -e "\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef"
echo -n -e "\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff"
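If no C compiler is handy on the sending side either, roughly the same script can be generated from od output. The following is a sketch under that assumption; echo_script is a hypothetical name, and unlike the sample above it writes two hex digits for every byte.

```shell
# echo_script: hypothetical helper. Prints one 'echo -n -e "..."' command per
# line of od output (16 bytes by default) for the file given as $1.
echo_script() {
    od -An -tx1 -v "$1" | while read -r line; do
        [ -n "$line" ] || continue
        printf 'echo -n -e "'
        # $line is left unquoted so it splits into individual hex pairs
        for h in $line; do
            printf '\\x%s' "$h"
        done
        printf '"\n'
    done
}

# usage: echo_script some > some.ash
```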

3) base64.ash

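When encoding, base64.ash does not process the binary directly; it processes input in the style of "xxd -p" or "od -An -tx1 -v --width=30". Spaces are allowed and only lowercase [a-f] is supported. If busybox does not provide od, base64.ash cannot base64-encode. The main reason base64.ash does not process the binary directly is that busybox's ash does not support -d and therefore cannot read \n reliably.

For base64 decoding, base64.ash depends only on busybox's ash, but it is extremely inefficient.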

#!/bin/ash

#
# Author : scz@nsfocus
#        : 2018-10-08
#        : 2018-10-16 16:12
#

base64encode ()
{
    base64table="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    hextable="0123456789abcdef"
    o=0
    bits=0
    n=0
    maxn=76
    count=0

    while LANG=C IFS= read -r -n 1 byte
    do
        if [ "$byte" ] ; then
            h=${hextable%${byte}*}
            h=${#h}
            if [[ $h != 16 ]] ; then
                let count+=1
                i=0
                while [ $i -lt 4 ]
                do
                    let o=o*2+h/8
                    let h=h*2%16
                    let bits+=1
                    if [[ $bits == 6 ]] ; then
                        printf ${base64table:$o:1}
                        let n+=1
                        if [ $n -ge $maxn ] ; then
                            printf "\n"
                            n=0
                        fi
                        o=0
                        bits=0
                    fi
                    let i+=1
                done
            fi
        fi
    done
    if [[ $bits != 0 ]] ; then
        while [ $bits -lt 6 ]
        do
            let bits+=1
            let o*=2
        done
        printf ${base64table:$o:1}
        let n+=1
        if [ $n -ge $maxn ] ; then
            printf "\n"
            n=0
        fi
    fi
    let count=count/2%3
    if [[ $count != 0 ]] ; then
        i=0
        let t=3-count
        while [ $i -lt $t ]
        do
            printf "="
            let n+=1
            if [ $n -ge $maxn ] ; then
                printf "\n"
                n=0
            fi
            let i+=1
        done
    fi
    if [ $n -ne 0 ] ; then
        printf "\n"
    fi
}

base64decode ()
{
    base64table="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    o=0
    bits=0

    while LANG=C IFS= read -r -n 1 byte
    do
        if [ "$byte" ] ; then
            c=${base64table%${byte}*}
            c=${#c}
            if [[ $c != 64 ]] ; then
                b=0
                while [ $b -lt 6 ]
                do
                    let o=o*2+c/32
                    let c=c*2%64
                    let bits+=1
                    if [[ $bits == 8 ]] ; then
                        escseq=$(printf "\x%02x" $o)
                        printf $escseq
                        o=0
                        bits=0
                    fi
                    let b+=1
                done
            fi
        fi
    done
}

if [ "$1" != "-d" ] ; then
    base64encode
else
    base64decode
fi

$ echo -n -e "scz@nsfocus" | xxd -p | busybox ash base64.ash

$ busybox od -An -tx1 -v some | busybox ash base64.ash > some.txt
$ busybox ash base64.ash -d < some.txt > some

openssl

openssl can do base64 encoding and decoding. The target environment is generally not expected to have openssl; it is listed here only for completeness.

$ openssl enc -base64 -e -in some -out some.txt
$ openssl enc -base64 -d -in some.txt -out some 
$ base64 -d some.txt > some

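Summary

Everything up to this point has been about converting between bin and txt with various encodings and decodings. The assumption is that the only data channel is a weak shell that echoes output, over which printable characters can be transferred losslessly by copy and paste. To get non-printable bytes across, they have to be mapped through encoding and decoding. Only three data-mapping schemes were demonstrated; more codecs exist, but there is no need for them, these three are enough.

A weak environment rules out doing the encoding and decoding in C. It has to be done with whatever limited tools already exist, which is why all these tricks came into play.

What follows is some loosely related material.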

perl

1) nc.pl

nc.pl implements part of nc's functionality.

#!/usr/bin/perl

use IO::Socket;

$SIG{PIPE}  = 'IGNORE';
$buflen     = 102400;

die "Usage: $0 <host> <port>\n" unless ($host = shift) && ($port = shift);

die "connect to $host:$port: $!\n" unless
    $sock   = new IO::Socket::INET
    (
        PeerAddr    => $host,
        PeerPort    => $port,
        Proto       => 'tcp'
    );

while ( ( $count = sysread( STDIN, $buffer, $buflen ) ) > 0 )
{
    die "socket write error: $!\n" unless syswrite( $sock, $buffer, $count ) == $count;
}
# sysread returns undef on error, never a negative value
die "stdin read error: $!\n" unless defined $count;
die "close socket: $!\n" unless close( $sock );

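I originally had no intention of bringing perl into this article; an environment that has perl is generally not a weak one. The earlier sections mainly target a serial-console login shell with no network, and give priority to harsh, stripped-down busybox environments.

Later I remembered once handling an x64/Solaris box. Forensics were required, installing extra binaries on it was not allowed, and the system had no nc but did have a perl interpreter. The scenario is not that harsh, but it is constrained all the same.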

$ dd if=/dev/dsk/cNtNdNs2 | nc.pl <host> <port>

That is how the disk was pulled off the machine with dd.
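
The receiving side is not shown above. A minimal sketch, assuming the analyst's own machine has a netcat that accepts the traditional "-l -p <port>" form (busybox nc does as well; OpenBSD netcat takes "nc -l <port>" instead). The name recv_image is made up for illustration.

```shell
#!/bin/sh
# Sketch only: recv_image is an illustrative name. The -l -p form is the
# traditional/busybox netcat syntax; OpenBSD netcat uses "nc -l <port>".
recv_image() {
    if [ $# -ne 2 ]; then
        echo "usage: recv_image <port> <outfile>" >&2
        return 2
    fi
    # listen on the given port and save whatever arrives
    nc -l -p "$1" > "$2"
}
```

Start the listener first, then run the dd pipeline on the Solaris side with the listener's address and port.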

SecureCRT

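This section covers the ZMODEM/YMODEM/XMODEM/KERMIT options. They are useful in some scenarios, U-Boot included, although the examples here use Windows and Linux.

1) Uploading files from Windows to Linux

1.1) ZMODEM (recommended)

Install the lrzsz package on Linux: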

$ aptitude install lrzsz

Assume SecureCRT on Windows is logged in to Linux over SSH. In the current Linux shell, change to the directory that will hold the uploaded files, for example:

$ cd /tmp/modem/

In SecureCRT on Windows:

Transfer->Zmodem Upload List->select the files to upload->Start Upload

The uploaded files then appear in /tmp/modem.

The whole process implicitly runs rz on the Linux side:

$ rz
rz waiting to receive.
Starting zmodem transfer. Press Ctrl+C to cancel.
Transferring <file>...

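A second way: log in to Linux over SSH with SecureCRT, change directory on the Linux side, then drag and drop the files onto the SecureCRT SSH session window on Windows. A small dialog pops up; choose "Send Zmodem" in it.

A third way: log in to Linux over SSH with SecureCRT, change directory on the Linux side, and run rz there. SecureCRT pops up a file chooser; confirm the selection and the upload completes.

1.2) YMODEM

Compared with ZMODEM, YMODEM and XMODEM offer no advantage. They are shown here for demonstration only and are not recommended.

On Linux, run: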

$ rb -b

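In SecureCRT on Windows:

Options->Session Options->Terminal->X/Y/Zmodem->X/Ymodem send packet size

128 bytes // default
1024 bytes (Xmodem-1k/Ymodem-1k) // choose this one

Transfer->Send Ymodem->select the files

Alternatively, drag and drop the files onto the corresponding SecureCRT session window. YMODEM is slower than ZMODEM, and on Debian the transfer surprisingly has to be ended with Ctrl-C, though this does not affect the uploaded data.

1.3) XMODEM (not recommended)

On Linux, run: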

$ rx -b some

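rx can receive only one file at a time.

In SecureCRT on Windows:

Options->Session Options->Terminal->X/Y/Zmodem->X/Ymodem send packet size->1024 bytes (Xmodem-1k/Ymodem-1k)
Transfer->Send Xmodem->select the file

The file does not have to be named some on Windows, but it will be renamed to some on Linux. On Debian the transfer may likewise have to be ended with Ctrl-C, which does not affect the uploaded data.

Compared with ZMODEM and YMODEM, XMODEM has one major problem. The man page says: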

Up to 1023 garbage characters may be added to the received file.

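Because of this trailing padding, XMODEM is a poor choice for uploading binaries, although the padding can be cut off with dd. ZMODEM and YMODEM do not have this problem.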
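
Cutting off the padding with dd can be sketched as follows, assuming the original file size is known from the sending side (for example from its directory listing). The name trim_to_size is made up for illustration.

```shell
#!/bin/sh
# Sketch only: trim_to_size is an illustrative name. The original size must
# be known from elsewhere; it is not recovered from the padded file.
trim_to_size() {
    # $1 = padded file, $2 = original size in bytes, $3 = output file
    dd if="$1" of="$3" bs=1 count="$2" 2>/dev/null
}
```

For example, "trim_to_size some 256 some.trimmed" keeps only the first 256 bytes. Truncating to a known size is deliberately used here instead of stripping trailing padding bytes, because a binary may legitimately end in the same byte value. bs=1 is slow on large files.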

1.4) KERMIT

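Articles about ZMODEM abound while articles about KERMIT are scarce. I have even seen one whose title promised KERMIT but whose content was actually ZMODEM, which is ridiculous.

Install the ckermit package on Linux:

$ aptitude install ckermit

On Linux, run: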

$ kermit -i -r

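In SecureCRT on Windows:

Transfer->Send Kermit->select the files (multiple selection allowed)

Alternatively, drag and drop the files onto the corresponding SecureCRT session window and choose "Send Kermit" in the dialog that pops up.

2) Downloading files from Linux to Windows

2.1) ZMODEM (recommended)

Assume SecureCRT on Windows is logged in to Linux over SSH.

In SecureCRT on Windows:

Options->Session Options->Terminal->X/Y/Zmodem->Directories->Download->specify the directory for downloaded files

The Upload setting can be ignored.

On Linux, run: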

$ sz -b zmodem.bin other.bin
rz
Starting zmodem transfer. Press Ctrl+C to cancel.
Transferring zmodem.bin...
...
Transferring other.bin...
...

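Check the Download directory on Windows; the downloaded files are already there.

SecureCRT's handling of sz is fairly smart, so there is no menu entry like the one you might expect: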

Transfer->Receive Zmodem

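This causes some compatibility problems. One remote host was an embedded ARM/Linux box with sz version 3.48. After running "sz -b <file>" remotely, SecureCRT did not react, yet a YMODEM download succeeded. Later I cross-compiled a statically linked build of lrzsz 0.12.21-10 from Debian 9, put it on that ARM/Linux box, and the ZMODEM download succeeded.

2.2) YMODEM

On Linux, run: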

$ sb -b -k ymodem.bin other.bin

In SecureCRT on Windows:

Options->Session Options->Terminal->X/Y/Zmodem->Directories->Download->specify the directory for downloaded files
Options->Session Options->Terminal->X/Y/Zmodem->X/Ymodem send packet size->1024 bytes (Xmodem-1k/Ymodem-1k)
Transfer->Receive Ymodem

Check the Download directory on Windows; the downloaded files are already there.

2.3) XMODEM (not recommended)

On Linux, run:

$ sx -b -k xmodem.bin

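sx can send only one file at a time.

In SecureCRT on Windows:

Options->Session Options->Terminal->X/Y/Zmodem->Directories->Download->specify the directory for downloaded files
Options->Session Options->Terminal->X/Y/Zmodem->X/Ymodem send packet size->1024 bytes (Xmodem-1k/Ymodem-1k)
Transfer->Receive Xmodem

Unlike section 2.2, a file dialog pops up here, letting you choose the output directory and also specify the output file name.

The trailing padding (0x1a) mentioned in section 1.3 is not peculiar to the Linux rx command; it appears to be part of the XMODEM specification.

SecureCRT also pads the tail when it receives a file over XMODEM. What data is used and how many bytes are added can be read from the rx source. I have already decided not to use XMODEM, so I did not dig further.

2.4) KERMIT

On Linux, run: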

$ kermit -I -P -i -s kermit.bin other.bin

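Specify -P; otherwise the file names become all uppercase once downloaded to Windows.

In SecureCRT on Windows:

Options->Session Options->Terminal->X/Y/Zmodem->Directories->Download->specify the directory for downloaded files
Transfer->Receive Kermit

Check the Download directory on Windows; the downloaded files are already there. SecureCRT has no separate setting for a KERMIT download directory; KERMIT shares the ZMODEM download directory.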

zssh

If A and B are both Linux, rz/sz can still be used for uploads and downloads, but zssh is needed. zssh stands for "Zmodem SSH". Debian packages it, so just install it.

$ apt-cache search zssh
$ dpkg -L zssh | grep "/bin/"
/usr/bin/zssh
/usr/bin/ztelnet

The man page says:

zssh is an interactive wrapper for ssh used to switch the ssh connection between the remote shell and file transfers. This is achieved by using another tty/pty pair between the user and the local ssh process to plug either the user’s tty (remote shell mode) or another process (file transfer mode) on the ssh connection.

ztelnet behaves similarly to zssh, except telnet is used instead of ssh. It is equivalent to 'zssh -s "telnet -8 -E"'

$ zssh <user>@<ip>

After logging in, run in the remote shell:

$ sz zmodem.bin other.bin
**B00000000000000

Press the zssh "escape sequence", which is Ctrl-@ (or Ctrl-2) by default. This brings up another prompt; type rz there:

zssh > rz

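This completes the download. There is a pitfall here. Suppose you used SecureCRT on C to log in to A remotely, and that session has ZMODEM enabled. The steps above are meant to send files from B to A, but the actual effect is that the files go from B to C. To get the intended result in this scenario, ZMODEM between C and A must be disabled first.

Uploading is simpler. Run sz at the "zssh >" prompt: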

zssh > sz zmodem.bin other.bin

Uploading is as "smart" as with SecureCRT: there is no need to run rz explicitly in the remote shell.