Linux & Open Source

Convert DBF to CSV on Linux with Perl and XBase

I spent a few days to figure out how to make a simple conversion of a DBF file into a plain text file as comma separated values (CSV) or tab separated values (TSV) in a batch/command line way. I was almost setting up an OpenOffice.org server because it seamed to be the only packaged solution to read and convert DBFs.

Well, it easier than that. The secret lives in the XBase drivers for Perl Data Base Interface project. To install it on Fedora or Red Hat use this command:

yum install perl-DBD-XBase

If you use other Linux distributions, you may look for a package with a similar name.

For a simple file conversion, you don’t need to program in Perl because the perl-DBD-XBase package contains a command line tool for that: dbftool.pl. So to convert the mytable.dbf file into a more standard CSV file, do this:

dbfdump.pl --fs "," mytable.dbf > mytable.csv

If you want to use TAB instead of a comma to separate fields, while editing the command put the cursor between the double quotes, press Ctrl+V and then press the TAB key on your keyboard. This will fill the double quotes with a real (but invisible) TAB char.

8 comments to Convert DBF to CSV on Linux with Perl and XBase

  • simon

    I’m on Ubuntu 14.04 LTS. I installed libdbd-xbase-perl via Synaptic. Then trying to do dbfdump, I got the following:
    snoopie2@snoopie2-Compaq-Presario-CQ40-Notebook-PC:~/special-tools/sage-6.2-i686-Linux$ dbfdump
    程序“dbfdump”尚未安装。 您可以使用以下命令安装:
    sudo apt-get install shapelib
    snoopie2@snoopie2-Compaq-Presario-CQ40-Notebook-PC:~/special-tools/sage-6.2-i686-Linux$ sudo apt-get install shapelib[sudo] password for snoopie2:
    正在读取软件包列表… 完成
    正在分析软件包的依赖关系树
    正在读取状态信息… 完成
    将会安装下列额外的软件包:
    libshp1
    下列【新】软件包将被安装:
    libshp1 shapelib
    升级了 0 个软件包,新安装了 2 个软件包,要卸载 0 个软件包,有 30 个软件包未被升级。
    需要下载 59.2 kB 的软件包。
    解压缩后会消耗掉 217 kB 的额外空间。
    您希望继续执行吗? [Y/n] y
    获取:1 http://hk.archive.ubuntu.com/ubuntu/ trusty/universe libshp1 i386 1.2.10-7 [22.8 kB]
    获取:2 http://hk.archive.ubuntu.com/ubuntu/ trusty/universe shapelib i386 1.2.10-7 [36.4 kB]
    下载 59.2 kB,耗时 1秒 (40.1 kB/s)
    Selecting previously unselected package libshp1:i386.
    (正在读取数据库 … 系统当前共安装有 237520 个文件和目录。)
    Preparing to unpack …/libshp1_1.2.10-7_i386.deb …
    Unpacking libshp1:i386 (1.2.10-7) …
    Selecting previously unselected package shapelib.
    Preparing to unpack …/shapelib_1.2.10-7_i386.deb …
    Unpacking shapelib (1.2.10-7) …
    Processing triggers for man-db (2.6.7.1-1) …
    正在设置 libshp1:i386 (1.2.10-7) …
    正在设置 shapelib (1.2.10-7) …
    Processing triggers for libc-bin (2.19-0ubuntu6) …
    snoopie2@snoopie2-Compaq-Presario-CQ40-Notebook-PC:~/special-tools/sage-6.2-i686-Linux$ dbfdump
    dbfdump [-h] [-r] [-m] xbase_file
    -h: Write header info (field descriptions)
    -r: Write raw field info, numeric values not reformatted
    -m: Multiline, one line per field.
    snoopie2@snoopie2-Compaq-Presario-CQ40-Notebook-PC:~/special-tools/sage-6.2-i686-Linux$ man dbfdump
    snoopie2@snoopie2-Compaq-Presario-CQ40-Notebook-PC:~/special-tools/sage-6.2-i686-Linux$
    It turned out you need to install shapelib to get dbfdump in Ubuntu 14.04.

  • Leonardo

    Hi. Thank you very much!! it’s helped me alot!

  • Thanks wintel. Used the dbf_dump command from the libdbd-xbase-perl package on Ubuntu to good effect.

  • Josias

    Some work still needs to be done on the intelligent checking of input fields but so far I have tested the following and it worked.
    Thanks again to this site, and linuxcommand.org, and man and info pages

    # warning! Script needs intelligence

    # use dbfdump to convert DB2000 student database to comma-seperated values
    # find the new or changed student entries and put in a list of changes
    # replace the known student list
    # rip out and reorder the fields from the changes list into a file for lat-users
    # rip out and add user names to filter list to allow internet access then reload danguardian filter groups
    # run lazy admin tools to import data into sme-server samba and linux users
    # Update samba domain group maps

    dbfdump –fs “,” /home/deltacopy/db2000/MC$(date +%Y)/Student.dbf > /root/students.new
    diff –old-line-format=” –unchanged-line-format=” –new-line-format=’%L’ /root/students.old /root/students.new > /root/students.changed
    mv students.new students.old -f
    gawk ‘BEGIN { FS = “,” ; OFS = “|” } \
    { print “s”$1,$4,$3,$2,””,””,””,””,””,””,””,””,”students”,”power” >”/root/newstudents.csv” }’ /root/students.changed
    gawk ‘BEGIN { FS = “,” ; OFS = “=” } \
    { print “s”$1,”filter2 # students” >>”/etc/danguardian/lists/filtergroupslist” }’ /root/students.changed
    dansguardian -g
    lat-users -a -i /root/newstudents.csv &&\
    rm -rf /root/newstudents.csv &&\
    /etc/e-smith/events/actions/update-domain-group-maps

  • Josias

    Thanks, I have been wondering for years how to insert data from a .dbf into samba and linux server, this has helped considerably. Next step, using diff to check for changes and then to use sed or gawk to take certain fields from the changes for input into a lazy-admin-tools input file. I might need to use read/readline too.
    Learning is fun.

  • Mark LaPierre

    Commented out O_BINARY as suggested

    [mlapier@mushroom]$ dbfdump.pl –fs “,” 20110205dailydat.dbf > 20110205dailydat.csv
    Possible attempt to put comments in qw() list at /usr/lib/perl5/Fcntl.pm line 140.
    Prototype mismatch: sub XBase::Base::O_BINARY () vs none at (eval 5) line 1.
    Constant subroutine O_BINARY redefined at (eval 5) line 1.
    [mlapier@mushroom]$

    I’m open for suggestions? ;-)

    • Avi

      I have no idea. I used it only once or twice.

      I would guess that your DBF has binary fields which are incompatible with such conversion because CSV is text only.

      If so, I would guess you need to use the XBase Perl API instead of using the vanilla dump tool.

      Another guess is the encoding of text on your DBF file is incompatible with your environment so the tool is interpreting it as binary.

      But again, these are only guesses.

  • wintel

    I have just happened to work on some dbf dbase files too recently, and find out your post is very helpful (I plan to use perl dbi at the beginning). Here are some extra works I have done if you run windows, or other distributions of linux, such as opensuse, debian, ubuntu… hope is helpful for anyone who meet the same problem.

    ATTENTION: make sure you use –nomemo switch when you run dbfdump, otherwirse you will get error message (missing dbt).

    WINDOWS XP: install activeperl, and then install dbd-xbase package under ppm-shell. then run dbfdump.bat (mine is under c:\perl\site\bin)

    dbfdump.bat –fs=:|” –nomemo mydebasefile.dbf

    LINUX, install perl-dbd-xbase (suse), libdbd-xbase-perl (debian or ubuntu), and run

    dbfdump.bat –fs=:|” –nomemo mydebasefile.dbf

    for some version incompatible reason, you may get the error message under linux

    prototype mismatch: sub Xbase::Base::O_BINARY () vs none at (eval 4). what I did is find the Fcntl.pm file for perl, and comment out the O_BINARY line, it works. If you need more information, google could help you more.

Leave a Reply

  

  

  

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>