Split Command in Linux: 9 Useful Examples

The split command in Linux breaks a file into multiple smaller files. Its options let you control the size of each piece, the number of pieces, and how the output files are named. I’ll show you some examples of the split command that will help you understand its usage.

To help you learn about the split command, I am using a relatively large text file that contains 17,170 lines and is about 1.4 MB in size. You can download a copy of this file from the GitHub link below.

Note that I will not directly display output in these examples because of the large file sizes. I will use the ll and wc commands to highlight file changes.

I advise you to have a quick look at the wc command to understand the output of the split command examples.

Examples of Split command in Linux

This is the syntax of the split command:

split [options] filename [prefix]

Let’s see how to use it to split files in Linux.

1. Split files into multiple files

By default, the split command starts a new file every 1000 lines. If no prefix is specified, it uses ‘x’. The letters that follow enumerate the files, so xaa comes first, then xab, and so on.

Let’s split the sample log file:

split someLogFile.log

If you use the ls command, you can see multiple new files in your directory.

~/Documents$ ls
someLogFile.log  xab  xad  xaf  xah  xaj  xal  xan  xap  xar
xaa              xac  xae  xag  xai  xak  xam  xao  xaq

You can use wc to quickly check the line counts after splitting.

~/Documents$ wc -l xaa xaq xar
1000 xaa
1000 xaq
170 xar

Remember from earlier that we saw our initial file had 17,170 lines. So we can see our program has done as expected by creating 18 new files. 17 of them are filled with 1000 lines each, and the last one has the remaining 170 lines.
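
If you do not have the sample file at hand, the same arithmetic can be reproduced with a generated file. This is a minimal sketch that assumes GNU split and uses a hypothetical demo.log made of 17170 numbered lines in a temporary directory:

```shell
# Assumption: GNU split; demo.log is a hypothetical stand-in for the sample file.
workdir=$(mktemp -d)          # throwaway directory so nothing else is touched
cd "$workdir"
seq 1 17170 > demo.log        # 17170 numbered lines

split demo.log                # defaults: 1000 lines per piece, prefix 'x'

pieces=$(ls x* | wc -l)       # how many pieces were created
last_lines=$(wc -l < xar)     # line count of the final piece
echo "pieces: $pieces, lines in xar: $last_lines"
```

It should report 18 pieces, with 170 lines in xar.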

Another way to see what is happening is to run the command with the verbose option. If you’re unfamiliar with verbose, you are missing out! It provides more detailed feedback about what your system is doing, and it is available in many commands.

split someLogFile.log --verbose

You can see what’s going on with your command on the display:

creating file 'xaa'
creating file 'xab'
creating file 'xac'
creating file 'xad'
creating file 'xae'
creating file 'xaf'
creating file 'xag'
creating file 'xah'
creating file 'xai'
creating file 'xaj'
creating file 'xak'
creating file 'xal'
creating file 'xam'
creating file 'xan'
creating file 'xao'
creating file 'xap'
creating file 'xaq'
creating file 'xar'

2. Split files into multiple files with a specific number of lines

I understand that you might not like that files are split into files of 1000 lines. You can change this behavior with the -l option.

When this is added, you can now specify how many lines you want in each of the new files.

split someLogFile.log -l 500

As you can guess, now the split files have 500 lines each, except the last one.

~/Documents$ wc -l xbh xbi
500 xbh
170 xbi

Now you have many more files, but with half as many lines in each one.
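
The number of pieces follows from a ceiling division of the total line count by the lines per piece. The sketch below checks that, again assuming GNU split and a hypothetical generated demo.log rather than the real sample file:

```shell
# Assumption: GNU split; demo.log is a hypothetical 17170-line test file.
workdir=$(mktemp -d)
cd "$workdir"
seq 1 17170 > demo.log

lines_per_piece=500
total=$(wc -l < demo.log)
# Ceiling division: a partial final piece still needs its own file.
expected=$(( (total + lines_per_piece - 1) / lines_per_piece ))

split -l "$lines_per_piece" demo.log
actual=$(ls x* | wc -l)
echo "expected $expected pieces, found $actual"
```

With 17170 lines and 500 lines per piece, both numbers come out to 35, which is why the last file in the listing above is xbi.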

3. Split the files into n number of files

The -n option makes it easy to split a file into a designated number of pieces or chunks. You specify how many files you want by adding an integer value after -n. Note that the pieces are sized by bytes, so a line can be cut in two at a chunk boundary.

split someLogFile.log -n 15

Now you can see that there are 15 new files.

~/Documents$ ls
someLogFile.log  xaa  xab  xac  xad  xae  xaf  xag  xah  xai  xaj  xak  xal  xam  xan  xao
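
GNU split also accepts the form -n l/N, which produces N pieces without splitting inside a line. The following sketch compares the two forms on a hypothetical generated file; the prefixes bytes_ and lines_ are made up for the illustration:

```shell
# Assumption: GNU split (the l/N form is a GNU extension).
workdir=$(mktemp -d)
cd "$workdir"
seq 1 17170 > demo.log

split -n 15 demo.log bytes_      # 15 pieces of equal byte size; lines may be cut
split -n l/15 demo.log lines_    # 15 pieces that always end on a line boundary

byte_pieces=$(ls bytes_* | wc -l)
line_pieces=$(ls lines_* | wc -l)

# Count line-based pieces whose last byte is not a newline (a cut line).
cut_lines=0
for f in lines_*; do
  [ -z "$(tail -c 1 "$f")" ] || cut_lines=$((cut_lines + 1))
done
echo "byte pieces: $byte_pieces, line pieces: $line_pieces, cut lines: $cut_lines"
```

Both runs create 15 files, and none of the lines_ pieces ends in the middle of a line.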

4. Split files with custom name prefix

What if you want to use split but keep the original name of your file, or use a new name altogether, instead of ‘x’?

You may remember seeing the prefix as part of the syntax described at the beginning of the article. You can write your own custom prefix after the source file.

split someLogFile.log someSeparatedLogFiles.log_

Here are the split files with names starting with the given prefix.

~/Documents$ ls
someLogFile.log               someSeparatedLogFiles.log_aj
someSeparatedLogFiles.log_aa  someSeparatedLogFiles.log_ak
someSeparatedLogFiles.log_ab  someSeparatedLogFiles.log_al
someSeparatedLogFiles.log_ac  someSeparatedLogFiles.log_am
someSeparatedLogFiles.log_ad  someSeparatedLogFiles.log_an
someSeparatedLogFiles.log_ae  someSeparatedLogFiles.log_ao
someSeparatedLogFiles.log_af  someSeparatedLogFiles.log_ap
someSeparatedLogFiles.log_ag  someSeparatedLogFiles.log_aq
someSeparatedLogFiles.log_ah  someSeparatedLogFiles.log_ar
someSeparatedLogFiles.log_ai

5. Split and Specify Suffix Length

Split uses a default suffix length of 2 (aa, ab, etc.). This will change automatically as the number of files increases, but if you would like to set it manually, that is possible too. Say you want your files to be named something like someSeparatedLogFiles.log_aaaab.

How can you do this? The option -a allows you to specify the length of the suffix.

split someLogFile.log someSeparatedLogFiles.log_ -a 5

And here are the split files:

~/Documents$ ls
someLogFile.log                  someSeparatedLogFiles.log_aaaae  someSeparatedLogFiles.log_aaaaj  someSeparatedLogFiles.log_aaaao
someSeparatedLogFiles.log_aaaaa  someSeparatedLogFiles.log_aaaaf  someSeparatedLogFiles.log_aaaak  someSeparatedLogFiles.log_aaaap
someSeparatedLogFiles.log_aaaab  someSeparatedLogFiles.log_aaaag  someSeparatedLogFiles.log_aaaal  someSeparatedLogFiles.log_aaaaq
someSeparatedLogFiles.log_aaaac  someSeparatedLogFiles.log_aaaah  someSeparatedLogFiles.log_aaaam  someSeparatedLogFiles.log_aaaar
someSeparatedLogFiles.log_aaaad  someSeparatedLogFiles.log_aaaai  someSeparatedLogFiles.log_aaaan

6. Split with numeric order suffix

Up to this point, you have seen your files separated using different letter combinations. Personally, I find it much easier to distinguish files using numbers.

Let’s keep the suffix length from the previous example, but change the alphabetical organization to numeric with the option -d.

split someLogFile.log someSeparatedLogFiles.log_ -a 5 -d

So now you will have split files with numerical suffixes.

~/Documents$ ls
someLogFile.log                  someSeparatedLogFiles.log_00004  someSeparatedLogFiles.log_00009  someSeparatedLogFiles.log_00014
someSeparatedLogFiles.log_00000  someSeparatedLogFiles.log_00005  someSeparatedLogFiles.log_00010  someSeparatedLogFiles.log_00015
someSeparatedLogFiles.log_00001  someSeparatedLogFiles.log_00006  someSeparatedLogFiles.log_00011  someSeparatedLogFiles.log_00016
someSeparatedLogFiles.log_00002  someSeparatedLogFiles.log_00007  someSeparatedLogFiles.log_00012  someSeparatedLogFiles.log_00017
someSeparatedLogFiles.log_00003  someSeparatedLogFiles.log_00008  someSeparatedLogFiles.log_00013
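
The prefix, -a, and -d options can be tried together on a small generated file. This is a sketch under the assumption of GNU split; demo.log and the prefix part_ are hypothetical names:

```shell
# Assumption: GNU split; demo.log and the part_ prefix are hypothetical.
workdir=$(mktemp -d)
cd "$workdir"
seq 1 17170 > demo.log

# Custom prefix, 5-character suffix, digits instead of letters.
split -d -a 5 demo.log part_

first=$(ls part_* | head -n 1)   # lowest-numbered piece
last=$(ls part_* | tail -n 1)    # highest-numbered piece
echo "first: $first, last: $last"
```

Numbering starts at zero, so the 18 pieces run from part_00000 to part_00017.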

7. Append hex suffixes to split files

Another option for suffix creation is the built-in hexadecimal suffix, which counts with the digits 0-9 followed by the letters a-f.

For this example, I will combine a few things I’ve already shown you. I will split the file using my own prefix. I chose an underscore for readability purposes.

I used the -x option to create a hex suffix. Then I split our file into 50 chunks and gave the suffix a length of 6.

split someLogFile.log _ -x -n50 -a6

And here is the outcome of the above command:

~/Documents$ ls
_000000  _000003  _000006  _000009  _00000c  _00000f  _000012  _000015  _000018  _00001b  _00001e  _000021  _000024  _000027  _00002a  _00002d  _000030
_000001  _000004  _000007  _00000a  _00000d  _000010  _000013  _000016  _000019  _00001c  _00001f  _000022  _000025  _000028  _00002b  _00002e  _000031
_000002  _000005  _000008  _00000b  _00000e  _000011  _000014  _000017  _00001a  _00001d  _000020  _000023  _000026  _000029  _00002c  _00002f  someLogFile.log
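
The last file name, _000031, is simply the number 49 written in hexadecimal, because the 50 chunks are numbered from 0. A small sketch, assuming GNU split and a hypothetical generated demo.log, makes the mapping visible:

```shell
# Assumption: GNU split; demo.log is a hypothetical test file.
workdir=$(mktemp -d)
cd "$workdir"
seq 1 17170 > demo.log

split -x -a 6 -n 50 demo.log _   # hex suffixes, 6 characters, 50 chunks

count=$(ls _* | wc -l)
last=$(ls _* | tail -n 1)        # highest suffix
# Strip the '_' prefix and read the suffix as a hexadecimal number.
last_index=$(printf '%d' "0x${last#_}")
echo "files: $count, last: $last (decimal index $last_index)"
```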

8. Split files into multiple files of specific size

It’s also possible to use file size to break up files in split. Maybe you need to send a large file over a size-capped network as efficiently as possible. You can specify the exact size for your requirements.

The syntax can get a little tricky as we continue to add options. So, I will explain how the -b option works before showing the example.

When you want to create files of a specific size, use the -b option. You can then write nK[B], nM[B], nG[B], where n is the size value. K is a kibibyte (1024 bytes), M is a mebibyte, G is a gibibyte, and so on. With the B appended, the units are powers of 1000: KB is a kilobyte (1000 bytes), MB is a megabyte, etc.

It may look like there is a lot going on in the command below, but it’s not that complex when you break it down. You specify the source file, the destination filename prefix, a numeric suffix, and a split size of 128KB (128,000 bytes).

split someLogFile.log someSeparatedLogFiles.log_ -d -b 128KB

Here are the split files:

~/Documents$ ls
someLogFile.log               someSeparatedLogFiles.log_02  someSeparatedLogFiles.log_05  someSeparatedLogFiles.log_08
someSeparatedLogFiles.log_00  someSeparatedLogFiles.log_03  someSeparatedLogFiles.log_06  someSeparatedLogFiles.log_09
someSeparatedLogFiles.log_01  someSeparatedLogFiles.log_04  someSeparatedLogFiles.log_07  someSeparatedLogFiles.log_10

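You can verify the result with the ‘wc’ command. The pattern _0* matches the first ten files, each of which is exactly 128000 bytes; the last file, _10, holds the remainder.

~/Documents$ wc someSeparatedLogFiles.log_0*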
1605    4959  128000 someSeparatedLogFiles.log_00
1605    4969  128000 someSeparatedLogFiles.log_01
1605    4953  128000 someSeparatedLogFiles.log_02
1605    4976  128000 someSeparatedLogFiles.log_03
1605    4955  128000 someSeparatedLogFiles.log_04
1605    4975  128000 someSeparatedLogFiles.log_05
1605    4966  128000 someSeparatedLogFiles.log_06
1605    4964  128000 someSeparatedLogFiles.log_07
1605    4968  128000 someSeparatedLogFiles.log_08
1605    4959  128000 someSeparatedLogFiles.log_09
16050   49644 1280000 total
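
The difference between K and KB is easy to miss, so here is a sketch that splits the same data both ways. It assumes GNU split and head; demo.bin and the prefixes kb_ and kib_ are hypothetical, and the file is just zero bytes with the same size as the sample log:

```shell
# Assumption: GNU split and head; file and prefix names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir"
head -c 1369273 /dev/zero > demo.bin   # same byte count as the sample log

split -d -b 128KB demo.bin kb_         # 128 * 1000 bytes per piece
split -d -b 128K  demo.bin kib_        # 128 * 1024 bytes per piece

kb_size=$(wc -c < kb_00)               # 128000
kib_size=$(wc -c < kib_00)             # 131072
kb_last=$(wc -c < kb_10)               # remainder left for the final piece
echo "KB piece: $kb_size bytes, K piece: $kib_size bytes, last KB piece: $kb_last bytes"
```

The final kb_ piece holds 1369273 - 10 * 128000 = 89273 bytes.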

9. Split files into multiple files of ‘At Most’ size n

If you want to split a file into pieces of roughly the same size but preserve the line structure, this might be the best choice for you. With -C, you can specify a maximum size. Then the program will automatically split the file based on complete lines.

split someLogFile.log someNewLogFiles.log_ -d -C 1MB

You can see in the output that the first split file is nearly 1MB in size, whereas the rest of the content is in the second file.

~/Documents$ ll
total 2772
drwxr-xr-x  2 chris chris   81920 Jul 24 22:01 ./
drwxr-xr-x 19 chris chris    4096 Jul 23 22:23 ../
-rw-r--r--  1 chris chris 1369273 Jul 20 17:52 someLogFile.log
-rw-r--r--  1 chris chris  999997 Jul 24 22:01 someNewLogFiles.log_00
-rw-r--r--  1 chris chris  369276 Jul 24 22:01 someNewLogFiles.log_01
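
To see that -C respects both the size cap and the line boundaries, the sketch below splits a hypothetical generated file with a 10KB limit and inspects every piece. It assumes GNU split; demo.log and the chunk_ prefix are made-up names:

```shell
# Assumption: GNU split; demo.log and the chunk_ prefix are hypothetical.
workdir=$(mktemp -d)
cd "$workdir"
seq 1 17170 > demo.log

split -d -C 10KB demo.log chunk_       # at most 10000 bytes of whole lines each

oversized=0    # pieces larger than the cap
cut_lines=0    # pieces that do not end with a newline
for f in chunk_*; do
  size=$(wc -c < "$f")
  [ "$size" -le 10000 ] || oversized=$((oversized + 1))
  [ -z "$(tail -c 1 "$f")" ] || cut_lines=$((cut_lines + 1))
  echo "$f: $size bytes"
done
echo "oversized: $oversized, cut lines: $cut_lines"
```

Both counters should stay at zero, since no piece exceeds the cap and every piece ends on a complete line.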

Bonus Tip: Rejoining split files

This isn’t a split command, but it might be helpful for new users.

~/Documents$ ls
xaa  xab  xac  xad  xae  xaf  xag  xah  xai  xaj  xak  xal  xam  xan  xao  xap  xaq  xar

You can use another command to rejoin those files and create a replica of our complete document. The cat command is short for concatenate which is just a fancy word that means “join items together”. Since all of the files begin with the letter ‘x’, the asterisk will apply the command to any files that begin with that letter.

~/Documents$ cat x* > recoveredLogFile.log
~/Documents$ ls
recoveredLogFile.log  xab  xad  xaf  xah  xaj  xal  xan  xap  xar
xaa                   xac  xae  xag  xai  xak  xam  xao  xaq

As you can see, our recreated file has the same number of lines as our original.

wc -l recoveredLogFile.log
17170 recoveredLogFile.log

Our formatting (including the number of lines) is preserved in the file created.
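
A line count only shows that the two files are similar. To confirm that the rejoined file is byte-for-byte identical to the original, cmp can compare them directly. This is a sketch that uses a hypothetical generated demo.log and output name rejoined.log:

```shell
# Assumption: demo.log and rejoined.log are hypothetical names.
workdir=$(mktemp -d)
cd "$workdir"
seq 1 17170 > demo.log

split demo.log                   # creates xaa, xab, ... in this directory
cat x* > rejoined.log            # the glob expands in sorted order

# cmp is silent and returns success only when the files are identical.
if cmp -s demo.log rejoined.log; then
  result="identical"
else
  result="different"
fi
echo "original and rejoined file are $result"
```

Choosing an output name that does not match x* keeps the glob from picking up the output file itself.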

If you’re new to Linux, I hope this tutorial helped you understand the split command. If you are more experienced, tell us your favorite way to use split in the comments below!

 
