Browse Source

This commit was generated by cvs2svn to compensate for changes in r2,

which included commits to RCS files with non-trunk default branches.


git-svn-id: https://svn.code.sf.net/p/jackcess/code/jackcess/trunk@3 f203690c-595d-4dc9-a70b-905162fa7fd2
tags/rel_1_1
Tim McCune 19 years ago
parent
commit
4f8ce372b4
32 changed files with 5527 additions and 0 deletions
  1. 2
    0
      .cvsignore
  2. 459
    0
      license.txt
  3. 18
    0
      project.properties
  4. 97
    0
      project.xml
  5. 114
    0
      src/java/com/healthmarketscience/jackcess/ByteUtil.java
  6. 549
    0
      src/java/com/healthmarketscience/jackcess/Column.java
  7. 94
    0
      src/java/com/healthmarketscience/jackcess/DataTypes.java
  8. 717
    0
      src/java/com/healthmarketscience/jackcess/Database.java
  9. 506
    0
      src/java/com/healthmarketscience/jackcess/Index.java
  10. 98
    0
      src/java/com/healthmarketscience/jackcess/InlineUsageMap.java
  11. 302
    0
      src/java/com/healthmarketscience/jackcess/JetFormat.java
  12. 88
    0
      src/java/com/healthmarketscience/jackcess/NullMask.java
  13. 135
    0
      src/java/com/healthmarketscience/jackcess/PageChannel.java
  14. 43
    0
      src/java/com/healthmarketscience/jackcess/PageTypes.java
  15. 118
    0
      src/java/com/healthmarketscience/jackcess/ReferenceUsageMap.java
  16. 559
    0
      src/java/com/healthmarketscience/jackcess/Table.java
  17. 239
    0
      src/java/com/healthmarketscience/jackcess/UsageMap.java
  18. 151
    0
      src/java/com/healthmarketscience/jackcess/scsu/Debug.java
  19. 46
    0
      src/java/com/healthmarketscience/jackcess/scsu/EndOfInputException.java
  20. 429
    0
      src/java/com/healthmarketscience/jackcess/scsu/Expand.java
  21. 45
    0
      src/java/com/healthmarketscience/jackcess/scsu/IllegalInputException.java
  22. 252
    0
      src/java/com/healthmarketscience/jackcess/scsu/SCSU.java
  23. BIN
      src/resources/com/healthmarketscience/jackcess/empty.mdb
  24. 6
    0
      src/resources/com/healthmarketscience/jackcess/log4j.properties
  25. 1
    0
      test/data/sample-input-only-headers.tab
  26. 3
    0
      test/data/sample-input.tab
  27. BIN
      test/data/test.mdb
  28. 203
    0
      test/src/java/com/healthmarketscience/jackcess/DatabaseTest.java
  29. 45
    0
      test/src/java/com/healthmarketscience/jackcess/ImportTest.java
  30. 51
    0
      test/src/java/com/healthmarketscience/jackcess/TableTest.java
  31. 110
    0
      xdocs/faq.fml
  32. 47
    0
      xdocs/index.xml

+ 2
- 0
.cvsignore View File

@@ -0,0 +1,2 @@
maven.log
target

+ 459
- 0
license.txt View File

@@ -0,0 +1,459 @@
GNU LESSER GENERAL PUBLIC LICENSE
Version 2.1, February 1999

Copyright (C) 1991, 1999 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

[This is the first released version of the Lesser GPL. It also counts
as the successor of the GNU Library Public License, version 2, hence
the version number 2.1.]

Preamble

The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
Licenses are intended to guarantee your freedom to share and change
free software--to make sure the software is free for all its users.

This license, the Lesser General Public License, applies to some
specially designated software packages--typically libraries--of the
Free Software Foundation and other authors who decide to use it. You
can use it too, but we suggest you first think carefully about whether
this license or the ordinary General Public License is the better
strategy to use in any particular case, based on the explanations below.

When we speak of free software, we are referring to freedom of use,
not price. Our General Public Licenses are designed to make sure that
you have the freedom to distribute copies of free software (and charge
for this service if you wish); that you receive source code or can get
it if you want it; that you can change the software and use pieces of
it in new free programs; and that you are informed that you can do
these things.

To protect your rights, we need to make restrictions that forbid
distributors to deny you these rights or to ask you to surrender these
rights. These restrictions translate to certain responsibilities for
you if you distribute copies of the library or if you modify it.

For example, if you distribute copies of the library, whether gratis
or for a fee, you must give the recipients all the rights that we gave
you. You must make sure that they, too, receive or can get the source
code. If you link other code with the library, you must provide
complete object files to the recipients, so that they can relink them
with the library after making changes to the library and recompiling
it. And you must show them these terms so they know their rights.

We protect your rights with a two-step method: (1) we copyright the
library, and (2) we offer you this license, which gives you legal
permission to copy, distribute and/or modify the library.

To protect each distributor, we want to make it very clear that
there is no warranty for the free library. Also, if the library is
modified by someone else and passed on, the recipients should know
that what they have is not the original version, so that the original
author's reputation will not be affected by problems that might be
introduced by others.
Finally, software patents pose a constant threat to the existence of
any free program. We wish to make sure that a company cannot
effectively restrict the users of a free program by obtaining a
restrictive license from a patent holder. Therefore, we insist that
any patent license obtained for a version of the library must be
consistent with the full freedom of use specified in this license.

Most GNU software, including some libraries, is covered by the
ordinary GNU General Public License. This license, the GNU Lesser
General Public License, applies to certain designated libraries, and
is quite different from the ordinary General Public License. We use
this license for certain libraries in order to permit linking those
libraries into non-free programs.

When a program is linked with a library, whether statically or using
a shared library, the combination of the two is legally speaking a
combined work, a derivative of the original library. The ordinary
General Public License therefore permits such linking only if the
entire combination fits its criteria of freedom. The Lesser General
Public License permits more lax criteria for linking other code with
the library.

We call this license the "Lesser" General Public License because it
does Less to protect the user's freedom than the ordinary General
Public License. It also provides other free software developers Less
of an advantage over competing non-free programs. These disadvantages
are the reason we use the ordinary General Public License for many
libraries. However, the Lesser license provides advantages in certain
special circumstances.

For example, on rare occasions, there may be a special need to
encourage the widest possible use of a certain library, so that it becomes
a de-facto standard. To achieve this, non-free programs must be
allowed to use the library. A more frequent case is that a free
library does the same job as widely used non-free libraries. In this
case, there is little to gain by limiting the free library to free
software only, so we use the Lesser General Public License.

In other cases, permission to use a particular library in non-free
programs enables a greater number of people to use a large body of
free software. For example, permission to use the GNU C Library in
non-free programs enables many more people to use the whole GNU
operating system, as well as its variant, the GNU/Linux operating
system.

Although the Lesser General Public License is Less protective of the
users' freedom, it does ensure that the user of a program that is
linked with the Library has the freedom and the wherewithal to run
that program using a modified version of the Library.

The precise terms and conditions for copying, distribution and
modification follow. Pay close attention to the difference between a
"work based on the library" and a "work that uses the library". The
former contains code derived from the library, whereas the latter must
be combined with the library in order to run.
GNU LESSER GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

0. This License Agreement applies to any software library or other
program which contains a notice placed by the copyright holder or
other authorized party saying it may be distributed under the terms of
this Lesser General Public License (also called "this License").
Each licensee is addressed as "you".

A "library" means a collection of software functions and/or data
prepared so as to be conveniently linked with application programs
(which use some of those functions and data) to form executables.

The "Library", below, refers to any such software library or work
which has been distributed under these terms. A "work based on the
Library" means either the Library or any derivative work under
copyright law: that is to say, a work containing the Library or a
portion of it, either verbatim or with modifications and/or translated
straightforwardly into another language. (Hereinafter, translation is
included without limitation in the term "modification".)

"Source code" for a work means the preferred form of the work for
making modifications to it. For a library, complete source code means
all the source code for all modules it contains, plus any associated
interface definition files, plus the scripts used to control compilation
and installation of the library.

Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running a program using the Library is not restricted, and output from
such a program is covered only if its contents constitute a work based
on the Library (independent of the use of the Library in a tool for
writing it). Whether that is true depends on what the Library does
and what the program that uses the Library does.
1. You may copy and distribute verbatim copies of the Library's
complete source code as you receive it, in any medium, provided that
you conspicuously and appropriately publish on each copy an
appropriate copyright notice and disclaimer of warranty; keep intact
all the notices that refer to this License and to the absence of any
warranty; and distribute a copy of this License along with the
Library.

You may charge a fee for the physical act of transferring a copy,
and you may at your option offer warranty protection in exchange for a
fee.
2. You may modify your copy or copies of the Library or any portion
of it, thus forming a work based on the Library, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:

a) The modified work must itself be a software library.

b) You must cause the files modified to carry prominent notices
stating that you changed the files and the date of any change.

c) You must cause the whole of the work to be licensed at no
charge to all third parties under the terms of this License.

d) If a facility in the modified Library refers to a function or a
table of data to be supplied by an application program that uses
the facility, other than as an argument passed when the facility
is invoked, then you must make a good faith effort to ensure that,
in the event an application does not supply such function or
table, the facility still operates, and performs whatever part of
its purpose remains meaningful.

(For example, a function in a library to compute square roots has
a purpose that is entirely well-defined independent of the
application. Therefore, Subsection 2d requires that any
application-supplied function or table used by this function must
be optional: if the application does not supply it, the square
root function must still compute square roots.)

These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Library,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Library, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote
it.

Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Library.

In addition, mere aggregation of another work not based on the Library
with the Library (or with a work based on the Library) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.

3. You may opt to apply the terms of the ordinary GNU General Public
License instead of this License to a given copy of the Library. To do
this, you must alter all the notices that refer to this License, so
that they refer to the ordinary GNU General Public License, version 2,
instead of to this License. (If a newer version than version 2 of the
ordinary GNU General Public License has appeared, then you can specify
that version instead if you wish.) Do not make any other change in
these notices.
Once this change is made in a given copy, it is irreversible for
that copy, so the ordinary GNU General Public License applies to all
subsequent copies and derivative works made from that copy.

This option is useful when you wish to copy part of the code of
the Library into a program that is not a library.

4. You may copy and distribute the Library (or a portion or
derivative of it, under Section 2) in object code or executable form
under the terms of Sections 1 and 2 above provided that you accompany
it with the complete corresponding machine-readable source code, which
must be distributed under the terms of Sections 1 and 2 above on a
medium customarily used for software interchange.

If distribution of object code is made by offering access to copy
from a designated place, then offering equivalent access to copy the
source code from the same place satisfies the requirement to
distribute the source code, even though third parties are not
compelled to copy the source along with the object code.

5. A program that contains no derivative of any portion of the
Library, but is designed to work with the Library by being compiled or
linked with it, is called a "work that uses the Library". Such a
work, in isolation, is not a derivative work of the Library, and
therefore falls outside the scope of this License.

However, linking a "work that uses the Library" with the Library
creates an executable that is a derivative of the Library (because it
contains portions of the Library), rather than a "work that uses the
library". The executable is therefore covered by this License.
Section 6 states terms for distribution of such executables.

When a "work that uses the Library" uses material from a header file
that is part of the Library, the object code for the work may be a
derivative work of the Library even though the source code is not.
Whether this is true is especially significant if the work can be
linked without the Library, or if the work is itself a library. The
threshold for this to be true is not precisely defined by law.

If such an object file uses only numerical parameters, data
structure layouts and accessors, and small macros and small inline
functions (ten lines or less in length), then the use of the object
file is unrestricted, regardless of whether it is legally a derivative
work. (Executables containing this object code plus portions of the
Library will still fall under Section 6.)

Otherwise, if the work is a derivative of the Library, you may
distribute the object code for the work under the terms of Section 6.
Any executables containing that work also fall under Section 6,
whether or not they are linked directly with the Library itself.
6. As an exception to the Sections above, you may also combine or
link a "work that uses the Library" with the Library to produce a
work containing portions of the Library, and distribute that work
under terms of your choice, provided that the terms permit
modification of the work for the customer's own use and reverse
engineering for debugging such modifications.

You must give prominent notice with each copy of the work that the
Library is used in it and that the Library and its use are covered by
this License. You must supply a copy of this License. If the work
during execution displays copyright notices, you must include the
copyright notice for the Library among them, as well as a reference
directing the user to the copy of this License. Also, you must do one
of these things:

a) Accompany the work with the complete corresponding
machine-readable source code for the Library including whatever
changes were used in the work (which must be distributed under
Sections 1 and 2 above); and, if the work is an executable linked
with the Library, with the complete machine-readable "work that
uses the Library", as object code and/or source code, so that the
user can modify the Library and then relink to produce a modified
executable containing the modified Library. (It is understood
that the user who changes the contents of definitions files in the
Library will not necessarily be able to recompile the application
to use the modified definitions.)

b) Use a suitable shared library mechanism for linking with the
Library. A suitable mechanism is one that (1) uses at run time a
copy of the library already present on the user's computer system,
rather than copying library functions into the executable, and (2)
will operate properly with a modified version of the library, if
the user installs one, as long as the modified version is
interface-compatible with the version that the work was made with.

c) Accompany the work with a written offer, valid for at
least three years, to give the same user the materials
specified in Subsection 6a, above, for a charge no more
than the cost of performing this distribution.

d) If distribution of the work is made by offering access to copy
from a designated place, offer equivalent access to copy the above
specified materials from the same place.

e) Verify that the user has already received a copy of these
materials or that you have already sent this user a copy.

For an executable, the required form of the "work that uses the
Library" must include any data and utility programs needed for
reproducing the executable from it. However, as a special exception,
the materials to be distributed need not include anything that is
normally distributed (in either source or binary form) with the major
components (compiler, kernel, and so on) of the operating system on
which the executable runs, unless that component itself accompanies
the executable.

It may happen that this requirement contradicts the license
restrictions of other proprietary libraries that do not normally
accompany the operating system. Such a contradiction means you cannot
use both them and the Library together in an executable that you
distribute.
7. You may place library facilities that are a work based on the
Library side-by-side in a single library together with other library
facilities not covered by this License, and distribute such a combined
library, provided that the separate distribution of the work based on
the Library and of the other library facilities is otherwise
permitted, and provided that you do these two things:

a) Accompany the combined library with a copy of the same work
based on the Library, uncombined with any other library
facilities. This must be distributed under the terms of the
Sections above.

b) Give prominent notice with the combined library of the fact
that part of it is a work based on the Library, and explaining
where to find the accompanying uncombined form of the same work.

8. You may not copy, modify, sublicense, link with, or distribute
the Library except as expressly provided under this License. Any
attempt otherwise to copy, modify, sublicense, link with, or
distribute the Library is void, and will automatically terminate your
rights under this License. However, parties who have received copies,
or rights, from you under this License will not have their licenses
terminated so long as such parties remain in full compliance.

9. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Library or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Library (or any work based on the
Library), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Library or works based on it.

10. Each time you redistribute the Library (or any work based on the
Library), the recipient automatically receives a license from the
original licensor to copy, distribute, link with or modify the Library
subject to these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties with
this License.
11. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Library at all. For example, if a patent
license would not permit royalty-free redistribution of the Library by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Library.

If any portion of this section is held invalid or unenforceable under any
particular circumstance, the balance of the section is intended to apply,
and the section as a whole is intended to apply in other circumstances.

It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.

This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.

12. If the distribution and/or use of the Library is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Library under this License may add
an explicit geographical distribution limitation excluding those countries,
so that distribution is permitted only in or among countries not thus
excluded. In such case, this License incorporates the limitation as if
written in the body of this License.

13. The Free Software Foundation may publish revised and/or new
versions of the Lesser General Public License from time to time.
Such new versions will be similar in spirit to the present version,
but may differ in detail to address new problems or concerns.

Each version is given a distinguishing version number. If the Library
specifies a version number of this License which applies to it and
"any later version", you have the option of following the terms and
conditions either of that version or of any later version published by
the Free Software Foundation. If the Library does not specify a
license version number, you may choose any version ever published by
the Free Software Foundation.
14. If you wish to incorporate parts of the Library into other free
programs whose distribution conditions are incompatible with these,
write to the author to ask for permission. For software which is
copyrighted by the Free Software Foundation, write to the Free
Software Foundation; we sometimes make exceptions for this. Our
decision will be guided by the two goals of preserving the free status
of all derivatives of our free software and of promoting the sharing
and reuse of software generally.

NO WARRANTY

15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
DAMAGES.

END OF TERMS AND CONDITIONS

+ 18
- 0
project.properties View File

@@ -0,0 +1,18 @@
maven.artifact.legacy=false
maven.changes.issue.template=http://sf.net/tracker/index.php?func=detail&aid=%ISSUE%&group_id=134943&atid=731445
maven.compile.source=1.5
maven.compile.target=1.5
maven.javadoc.excludepackagenames=com.healthmarketscience.jackcess.scsu
maven.javadoc.links=http://java.sun.com/j2se/1.5.0/docs/api
maven.javadoc.package=false
maven.javadoc.public=true
maven.javadoc.source=1.5
maven.junit.fork=on
maven.junit.jvmargs=-Xmx256M -server
maven.junit.sysproperties=log4j.configuration
maven.repo.remote=http://www.ibiblio.org/maven,http://maven-plugins.sf.net/maven
maven.sourceforge.project.groupId=134943
maven.sourceforge.username=javajedi
maven.test.source=1.5
log4j.configuration=com/hmsonline/common/access/log4j.properties
statcvs.include=**/*.java;**/*.xml

+ 97
- 0
project.xml View File

@@ -0,0 +1,97 @@
<?xml version="1.0" encoding="ISO-8859-1"?>
<project>
<pomVersion>1</pomVersion>
<id>jackcess</id>
<name>Jackcess</name>
<currentVersion>1.0</currentVersion>
<organization>
<name>Health Market Science, Inc.</name>
<url>http://www.healthmarketscience.com</url>
<logo>http://www.healthmarketscience.com/images/logo_red.jpg</logo>
</organization>
<inceptionYear>2005</inceptionYear>
<package>com.healthmarketscience.jackcess</package>
<description>A pure Java library for reading from and writing to MS Access databases.</description>
<url>http://jackcess.sf.net</url>
<issueTrackingUrl>http://sf.net/tracker/?group_id=134943&amp;atid=731445</issueTrackingUrl>
<siteAddress>jackcess.sf.net</siteAddress>
<siteDirectory>/home/groups/j/ja/jackcess/htdocs</siteDirectory>
<repository>
<connection>scm:cvs:pserver:anonymous@cvs.sf.net:/cvsroot/jackcess:jackcess</connection>
<url>http://cvs.sf.net/viewcvs.py/jackcess/</url>
</repository>
<mailingLists>
<mailingList>
<name>jackcess-users</name>
<subscribe>http://lists.sf.net/lists/listinfo/jackcess-users</subscribe>
<unsubscribe>http://lists.sf.net/lists/listinfo/jackcess-users</unsubscribe>
<archive>http://sf.net/mailarchive/forum.php?forum=jackcess-users</archive>
</mailingList>
</mailingLists>
<developers>
<developer>
<name>Tim McCune</name>
<id>javajedi</id>
<email>javajedi@users.sf.net</email>
<organization>Health Market Science, Inc.</organization>
<timezone>-5</timezone>
</developer>
</developers>
<licenses>
<license>
<name>GNU Lesser General Public License</name>
<url>http://www.gnu.org/copyleft/lesser.txt</url>
<distribution>manual</distribution>
</license>
</licenses>
<build>
<sourceDirectory>src/java</sourceDirectory>
<unitTestSourceDirectory>test/src/java</unitTestSourceDirectory>
<resources>
<resource>
<directory>src/resources</directory>
</resource>
</resources>
</build>
<dependencies>
<dependency>
<groupId>commons-collections</groupId>
<artifactId>commons-collections</artifactId>
<version>3.0</version>
</dependency>
<dependency>
<groupId>commons-lang</groupId>
<artifactId>commons-lang</artifactId>
<version>2.0</version>
</dependency>
<dependency>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
<version>1.0.3</version>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.7</version>
</dependency>
<dependency>
<groupId>maven-plugins</groupId>
<artifactId>maven-sourceforge-plugin</artifactId>
<version>1.1</version>
<type>plugin</type>
</dependency>
<dependency>
<groupId>statcvs</groupId>
<artifactId>maven-statcvs-plugin</artifactId>
<version>2.5</version>
<type>plugin</type>
</dependency>
</dependencies>
<reports>
<report>maven-faq-plugin</report>
<report>maven-javadoc-plugin</report>
<report>maven-jxr-plugin</report>
<report>maven-jdepend-plugin</report>
<report>maven-statcvs-plugin</report>
</reports>
</project>

+ 114
- 0
src/java/com/healthmarketscience/jackcess/ByteUtil.java View File

@@ -0,0 +1,114 @@
/*
Copyright (c) 2005 Health Market Science, Inc.

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA

You can contact Health Market Science at info@healthmarketscience.com
or at the following address:

Health Market Science
2700 Horizon Drive
Suite 200
King of Prussia, PA 19406
*/

package com.healthmarketscience.jackcess;

import java.nio.ByteBuffer;

/**
* Byte manipulation and display utilities
* @author Tim McCune
*/
public final class ByteUtil {
private static final String[] HEX_CHARS = new String[] {
"0", "1", "2", "3", "4", "5", "6", "7",
"8", "9", "A", "B", "C", "D", "E", "F"};
private ByteUtil() {}
/**
* Convert an int from 4 bytes to 3
* @param i Int to convert
* @return Array of 3 bytes in little-endian order
*/
public static byte[] to3ByteInt(int i) {
byte[] rtn = new byte[3];
rtn[0] = (byte) (i & 0xFF);
rtn[1] = (byte) ((i >>> 8) & 0xFF);
rtn[2] = (byte) ((i >>> 16) & 0xFF);
return rtn;
}
/**
* Read a 3 byte int from a buffer in little-endian order
* @param buffer Buffer containing the bytes
* @param offset Offset at which to start reading the int
* @return The int
*/
public static int get3ByteInt(ByteBuffer buffer, int offset) {
int rtn = buffer.get(offset) & 0xff;
rtn += ((((int) buffer.get(offset + 1)) & 0xFF) << 8);
rtn += ((((int) buffer.get(offset + 2)) & 0xFF) << 16);
rtn &= 16777215; //2 ^ (8 * 3) - 1
return rtn;
}
/**
* Convert a byte buffer to a hexadecimal string for display
* @param buffer Buffer to display, starting at offset 0
* @param size Number of bytes to read from the buffer
* @return The display String
*/
public static String toHexString(ByteBuffer buffer, int size) {
return toHexString(buffer, 0, size);
}
/**
* Convert a byte buffer to a hexadecimal string for display
* @param buffer Buffer to display, starting at offset 0
* @param offset Offset at which to start reading the buffer
* @param size Number of bytes to read from the buffer
* @return The display String
*/
public static String toHexString(ByteBuffer buffer, int offset, int size) {
StringBuffer rtn = new StringBuffer();
int position = buffer.position();
buffer.position(offset);
for (int i = 0; i < size; i++) {
byte b = buffer.get();
byte h = (byte) (b & 0xF0);
h = (byte) (h >>> 4);
h = (byte) (h & 0x0F);
rtn.append(HEX_CHARS[(int) h]);
h = (byte) (b & 0x0F);
rtn.append(HEX_CHARS[(int) h] + " ");
if ((i + 1) % 4 == 0) {
rtn.append(" ");
}
if ((i + 1) % 24 == 0) {
rtn.append("\n");
}
}
buffer.position(position);
return rtn.toString();
}
}

+ 549
- 0
src/java/com/healthmarketscience/jackcess/Column.java View File

@@ -0,0 +1,549 @@
/*
Copyright (c) 2005 Health Market Science, Inc.

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA

You can contact Health Market Science at info@healthmarketscience.com
or at the following address:

Health Market Science
2700 Horizon Drive
Suite 200
King of Prussia, PA 19406
*/

package com.healthmarketscience.jackcess;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.CharBuffer;
import java.sql.SQLException;
import java.util.Calendar;
import java.util.Date;
import java.util.Iterator;
import java.util.List;
import java.util.TimeZone;

import com.healthmarketscience.jackcess.scsu.EndOfInputException;
import com.healthmarketscience.jackcess.scsu.Expand;
import com.healthmarketscience.jackcess.scsu.IllegalInputException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
* Access database column definition
* @author Tim McCune
*/
public class Column implements Comparable {
private static final Log LOG = LogFactory.getLog(Column.class);
/**
* Access starts counting dates at Jan 1, 1900. Java starts counting
* at Jan 1, 1970. This is the # of days between them for conversion.
*/
private static final double DAYS_BETWEEN_EPOCH_AND_1900 = 25569d;
/**
* Access stores numeric dates in days. Java stores them in milliseconds.
*/
private static final double MILLISECONDS_PER_DAY = 86400000d;
/**
* Long value (LVAL) type that indicates that the value is stored on the same page
*/
private static final short LONG_VALUE_TYPE_THIS_PAGE = (short) 0x8000;
/**
* Long value (LVAL) type that indicates that the value is stored on another page
*/
private static final short LONG_VALUE_TYPE_OTHER_PAGE = (short) 0x4000;
/**
* Long value (LVAL) type that indicates that the value is stored on multiple other pages
*/
private static final short LONG_VALUE_TYPE_OTHER_PAGES = (short) 0x0;
/** For text columns, whether or not they are compressed */
private boolean _compressedUnicode = false;
/** Whether or not the column is of variable length */
private boolean _variableLength;
/** Numeric precision */
private byte _precision;
/** Numeric scale */
private byte _scale;
/** Data type */
private byte _type;
/** Format that the containing database is in */
private JetFormat _format;
/** Used to read in LVAL pages */
private PageChannel _pageChannel;
/** Maximum column length */
private short _columnLength;
/** 0-based column number */
private short _columnNumber;
/** Column name */
private String _name;
public Column() {
this(JetFormat.VERSION_4);
}
public Column(JetFormat format) {
_format = format;
}
/**
* Read a column definition in from a buffer
* @param buffer Buffer containing column definition
* @param offset Offset in the buffer at which the column definition starts
* @param format Format that the containing database is in
*/
public Column(ByteBuffer buffer, int offset, PageChannel pageChannel, JetFormat format) {
if (LOG.isDebugEnabled()) {
LOG.debug("Column def block:\n" + ByteUtil.toHexString(buffer, offset, 25));
}
_pageChannel = pageChannel;
_format = format;
setType(buffer.get(offset + format.OFFSET_COLUMN_TYPE));
_columnNumber = buffer.getShort(offset + format.OFFSET_COLUMN_NUMBER);
_columnLength = buffer.getShort(offset + format.OFFSET_COLUMN_LENGTH);
if (_type == DataTypes.NUMERIC) {
_precision = buffer.get(offset + format.OFFSET_COLUMN_PRECISION);
_scale = buffer.get(offset + format.OFFSET_COLUMN_SCALE);
}
_variableLength = ((buffer.get(offset + format.OFFSET_COLUMN_VARIABLE)
& 1) != 1);
_compressedUnicode = ((buffer.get(offset +
format.OFFSET_COLUMN_COMPRESSED_UNICODE) & 1) == 1);
}
public String getName() {
return _name;
}
public void setName(String name) {
_name = name;
}
public boolean isVariableLength() {
return _variableLength;
}
public void setVariableLength(boolean variableLength) {
_variableLength = variableLength;
}
public short getColumnNumber() {
return _columnNumber;
}
/**
* Also sets the length and the variable length flag, inferred from the type
*/
public void setType(byte type) {
_type = type;
setLength((short) size());
switch (type) {
case DataTypes.BOOLEAN:
case DataTypes.BYTE:
case DataTypes.INT:
case DataTypes.LONG:
case DataTypes.DOUBLE:
case DataTypes.FLOAT:
case DataTypes.SHORT_DATE_TIME:
setVariableLength(false);
break;
case DataTypes.BINARY:
case DataTypes.TEXT:
setVariableLength(true);
break;
}
}
public byte getType() {
return _type;
}
public int getSQLType() throws SQLException {
return DataTypes.toSQLType(_type);
}
public void setSQLType(int type) throws SQLException {
setType(DataTypes.fromSQLType(type));
}
public boolean isCompressedUnicode() {
return _compressedUnicode;
}
public byte getPrecision() {
return _precision;
}
public byte getScale() {
return _scale;
}
public void setLength(short length) {
_columnLength = length;
}
public short getLength() {
return _columnLength;
}
/**
* Deserialize a raw byte value for this column into an Object
* @param data The raw byte value
* @return The deserialized Object
*/
public Object read(byte[] data) throws IOException {
return read(data, ByteOrder.LITTLE_ENDIAN);
}
/**
* Deserialize a raw byte value for this column into an Object
* @param data The raw byte value
* @param order Byte order in which the raw value is stored
* @return The deserialized Object
*/
public Object read(byte[] data, ByteOrder order) throws IOException {
ByteBuffer buffer = ByteBuffer.wrap(data);
buffer.order(order);
switch (_type) {
case DataTypes.BOOLEAN:
throw new IOException("Tried to read a boolean from data instead of null mask.");
case DataTypes.BYTE:
return new Byte(buffer.get());
case DataTypes.INT:
return new Short(buffer.getShort());
case DataTypes.LONG:
return new Integer(buffer.getInt());
case DataTypes.DOUBLE:
return new Double(buffer.getDouble());
case DataTypes.FLOAT:
return new Float(buffer.getFloat());
case DataTypes.SHORT_DATE_TIME:
long time = (long) ((buffer.getDouble() - DAYS_BETWEEN_EPOCH_AND_1900) *
MILLISECONDS_PER_DAY);
Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("GMT"));
cal.setTimeInMillis(time);
//Not sure why we're off by 1...
cal.add(Calendar.DATE, 1);
return cal.getTime();
case DataTypes.BINARY:
return data;
case DataTypes.TEXT:
if (_compressedUnicode) {
try {
String rtn = new Expand().expand(data);
//SCSU expander isn't handling the UTF-8-looking 2-byte combo that
//prepends some of these strings. Rather than dig into that code,
//I'm just stripping them off here. However, this is probably not
//a great idea.
if (rtn.length() > 2 && (int) rtn.charAt(0) == 255 &&
(int) rtn.charAt(1) == 254)
{
rtn = rtn.substring(2);
}
//It also isn't handling short strings.
if (rtn.length() > 1 && (int) rtn.charAt(1) == 0) {
char[] fixed = new char[rtn.length() / 2];
for (int i = 0; i < fixed.length; i ++) {
fixed[i] = rtn.charAt(i * 2);
}
rtn = new String(fixed);
}
return rtn;
} catch (IllegalInputException e) {
throw new IOException("Can't expand text column");
} catch (EndOfInputException e) {
throw new IOException("Can't expand text column");
}
} else {
return _format.CHARSET.decode(ByteBuffer.wrap(data)).toString();
}
case DataTypes.MONEY:
//XXX
return null;
case DataTypes.OLE:
if (data.length > 0) {
return getLongValue(data);
} else {
return null;
}
case DataTypes.MEMO:
if (data.length > 0) {
return _format.CHARSET.decode(ByteBuffer.wrap(getLongValue(data))).toString();
} else {
return null;
}
case DataTypes.NUMERIC:
//XXX
return null;
case DataTypes.UNKNOWN_0D:
case DataTypes.GUID:
return null;
default:
throw new IOException("Unrecognized data type: " + _type);
}
}
/**
* @param lvalDefinition Column value that points to an LVAL record
* @return The LVAL data
*/
private byte[] getLongValue(byte[] lvalDefinition) throws IOException {
ByteBuffer def = ByteBuffer.wrap(lvalDefinition);
def.order(ByteOrder.LITTLE_ENDIAN);
short length = def.getShort();
byte[] rtn = new byte[length];
short type = def.getShort();
switch (type) {
case LONG_VALUE_TYPE_OTHER_PAGE:
if (lvalDefinition.length != _format.SIZE_LONG_VALUE_DEF) {
throw new IOException("Expected " + _format.SIZE_LONG_VALUE_DEF +
" bytes in long value definition, but found " + lvalDefinition.length);
}
byte rowNum = def.get();
int pageNum = ByteUtil.get3ByteInt(def, def.position());
ByteBuffer lvalPage = _pageChannel.createPageBuffer();
_pageChannel.readPage(lvalPage, pageNum);
short offset = lvalPage.getShort(14 +
rowNum * _format.SIZE_ROW_LOCATION);
lvalPage.position(offset);
lvalPage.get(rtn);
break;
case LONG_VALUE_TYPE_THIS_PAGE:
def.getLong(); //Skip over lval_dp and unknown
def.get(rtn);
case LONG_VALUE_TYPE_OTHER_PAGES:
//XXX
return null;
default:
throw new IOException("Unrecognized long value type: " + type);
}
return rtn;
}
/**
* Write an LVAL column into a ByteBuffer inline (LONG_VALUE_TYPE_THIS_PAGE)
* @param value Value of the LVAL column
* @return A buffer containing the LVAL definition and the column value
*/
public ByteBuffer writeLongValue(byte[] value) throws IOException {
ByteBuffer def = ByteBuffer.allocate(_format.SIZE_LONG_VALUE_DEF + value.length);
def.order(ByteOrder.LITTLE_ENDIAN);
def.putShort((short) value.length);
def.putShort(LONG_VALUE_TYPE_THIS_PAGE);
def.putInt(0);
def.putInt(0); //Unknown
def.put(value);
def.flip();
return def;
}
/**
* Write an LVAL column into a ByteBuffer on another page
* (LONG_VALUE_TYPE_OTHER_PAGE)
* @param value Value of the LVAL column
* @return A buffer containing the LVAL definition
*/
public ByteBuffer writeLongValueInNewPage(byte[] value) throws IOException {
ByteBuffer lvalPage = _pageChannel.createPageBuffer();
lvalPage.put(PageTypes.DATA); //Page type
lvalPage.put((byte) 1); //Unknown
lvalPage.putShort((short) (_format.PAGE_SIZE -
_format.OFFSET_LVAL_ROW_LOCATION_BLOCK - _format.SIZE_ROW_LOCATION -
value.length)); //Free space
lvalPage.put((byte) 'L');
lvalPage.put((byte) 'V');
lvalPage.put((byte) 'A');
lvalPage.put((byte) 'L');
int offset = _format.PAGE_SIZE - value.length;
lvalPage.position(14);
lvalPage.putShort((short) offset);
lvalPage.position(offset);
lvalPage.put(value);
ByteBuffer def = ByteBuffer.allocate(_format.SIZE_LONG_VALUE_DEF);
def.order(ByteOrder.LITTLE_ENDIAN);
def.putShort((short) value.length);
def.putShort(LONG_VALUE_TYPE_OTHER_PAGE);
def.put((byte) 0); //Row number
def.put(ByteUtil.to3ByteInt(_pageChannel.writeNewPage(lvalPage))); //Page #
def.putInt(0); //Unknown
def.flip();
return def;
}
/**
* Serialize an Object into a raw byte value for this column in little endian order
* @param obj Object to serialize
* @return A buffer containing the bytes
*/
public ByteBuffer write(Object obj) throws IOException {
return write(obj, ByteOrder.LITTLE_ENDIAN);
}
/**
* Serialize an Object into a raw byte value for this column
* @param obj Object to serialize
* @param order Order in which to serialize
* @return A buffer containing the bytes
*/
public ByteBuffer write(Object obj, ByteOrder order) throws IOException {
int size = size();
if (_type == DataTypes.OLE || _type == DataTypes.MEMO) {
size += ((byte[]) obj).length;
}
if (_type == DataTypes.TEXT) {
size = getLength();
}
ByteBuffer buffer = ByteBuffer.allocate(size);
buffer.order(order);
switch (_type) {
case DataTypes.BOOLEAN:
break;
case DataTypes.BYTE:
buffer.put(((Byte) obj).byteValue());
break;
case DataTypes.INT:
buffer.putShort(((Short) obj).shortValue());
break;
case DataTypes.LONG:
buffer.putInt(((Integer) obj).intValue());
break;
case DataTypes.DOUBLE:
buffer.putDouble(((Double) obj).doubleValue());
break;
case DataTypes.FLOAT:
buffer.putFloat(((Float) obj).floatValue());
break;
case DataTypes.SHORT_DATE_TIME:
Calendar cal = Calendar.getInstance();
cal.setTime((Date) obj);
long ms = cal.getTimeInMillis();
ms += (long) TimeZone.getDefault().getOffset(ms);
buffer.putDouble((double) ms / MILLISECONDS_PER_DAY +
DAYS_BETWEEN_EPOCH_AND_1900);
break;
case DataTypes.BINARY:
buffer.put((byte[]) obj);
break;
case DataTypes.TEXT:
CharSequence text = (CharSequence) obj;
int maxChars = size / 2;
if (text.length() > maxChars) {
text = text.subSequence(0, maxChars);
}
buffer.put(encodeText(text));
break;
case DataTypes.OLE:
buffer.put(writeLongValue((byte[]) obj));
break;
case DataTypes.MEMO:
buffer.put(writeLongValue(encodeText((CharSequence) obj).array()));
break;
default:
throw new IOException("Unsupported data type: " + _type);
}
buffer.flip();
return buffer;
}
/**
* @param text Text to encode
* @return A buffer with the text encoded
*/
private ByteBuffer encodeText(CharSequence text) {
return _format.CHARSET.encode(CharBuffer.wrap(text));
}
/**
* @return Number of bytes that should be read for this column
* (applies to fixed-width columns)
*/
public int size() {
switch (_type) {
case DataTypes.BOOLEAN:
return 0;
case DataTypes.BYTE:
return 1;
case DataTypes.INT:
return 2;
case DataTypes.LONG:
return 4;
case DataTypes.MONEY:
case DataTypes.DOUBLE:
return 8;
case DataTypes.FLOAT:
return 4;
case DataTypes.SHORT_DATE_TIME:
return 8;
case DataTypes.BINARY:
return 255;
case DataTypes.TEXT:
return 50 * 2;
case DataTypes.OLE:
return _format.SIZE_LONG_VALUE_DEF;
case DataTypes.MEMO:
return _format.SIZE_LONG_VALUE_DEF;
case DataTypes.NUMERIC:
throw new IllegalArgumentException("FIX ME");
case DataTypes.UNKNOWN_0D:
case DataTypes.GUID:
throw new IllegalArgumentException("FIX ME");
default:
throw new IllegalArgumentException("Unrecognized data type: " + _type);
}
}
public String toString() {
StringBuffer rtn = new StringBuffer();
rtn.append("\tName: " + _name);
rtn.append("\n\tType: 0x" + Integer.toHexString((int)_type));
rtn.append("\n\tNumber: " + _columnNumber);
rtn.append("\n\tLength: " + _columnLength);
rtn.append("\n\tVariable length: " + _variableLength);
rtn.append("\n\tCompressed Unicode: " + _compressedUnicode);
rtn.append("\n\n");
return rtn.toString();
}
public int compareTo(Object obj) {
Column other = (Column) obj;
if (_columnNumber > other.getColumnNumber()) {
return 1;
} else if (_columnNumber < other.getColumnNumber()) {
return -1;
} else {
return 0;
}
}
/**
* @param columns A list of columns in a table definition
* @return The number of variable length columns found in the list
*/
public static short countVariableLength(List columns) {
short rtn = 0;
Iterator iter = columns.iterator();
while (iter.hasNext()) {
Column col = (Column) iter.next();
if (col.isVariableLength()) {
rtn++;
}
}
return rtn;
}
}

+ 94
- 0
src/java/com/healthmarketscience/jackcess/DataTypes.java View File

@@ -0,0 +1,94 @@
/*
Copyright (c) 2005 Health Market Science, Inc.

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA

You can contact Health Market Science at info@healthmarketscience.com
or at the following address:

Health Market Science
2700 Horizon Drive
Suite 200
King of Prussia, PA 19406
*/

package com.healthmarketscience.jackcess;

import java.sql.SQLException;
import java.sql.Types;
import org.apache.commons.collections.bidimap.DualHashBidiMap;
import org.apache.commons.collections.BidiMap;

/**
* Access data types
* @author Tim McCune
*/
public final class DataTypes {
public static final byte BOOLEAN = 0x01;
public static final byte BYTE = 0x02;
public static final byte INT = 0x03;
public static final byte LONG = 0x04;
public static final byte MONEY = 0x05;
public static final byte FLOAT = 0x06;
public static final byte DOUBLE = 0x07;
public static final byte SHORT_DATE_TIME = 0x08;
public static final byte BINARY = 0x09;
public static final byte TEXT = 0x0A;
public static final byte OLE = 0x0B;
public static final byte MEMO = 0x0C;
public static final byte UNKNOWN_0D = 0x0D;
public static final byte GUID = 0x0F;
public static final byte NUMERIC = 0x10;
/** Map of Access data types to SQL data types */
private static BidiMap SQL_TYPES = new DualHashBidiMap();
static {
SQL_TYPES.put(new Byte(BOOLEAN), new Integer(Types.BOOLEAN));
SQL_TYPES.put(new Byte(BYTE), new Integer(Types.TINYINT));
SQL_TYPES.put(new Byte(INT), new Integer(Types.SMALLINT));
SQL_TYPES.put(new Byte(LONG), new Integer(Types.INTEGER));
SQL_TYPES.put(new Byte(MONEY), new Integer(Types.DECIMAL));
SQL_TYPES.put(new Byte(FLOAT), new Integer(Types.FLOAT));
SQL_TYPES.put(new Byte(DOUBLE), new Integer(Types.DOUBLE));
SQL_TYPES.put(new Byte(SHORT_DATE_TIME), new Integer(Types.TIMESTAMP));
SQL_TYPES.put(new Byte(BINARY), new Integer(Types.BINARY));
SQL_TYPES.put(new Byte(TEXT), new Integer(Types.VARCHAR));
SQL_TYPES.put(new Byte(OLE), new Integer(Types.LONGVARBINARY));
SQL_TYPES.put(new Byte(MEMO), new Integer(Types.LONGVARCHAR));
}
private DataTypes() {}
public static int toSQLType(byte dataType) throws SQLException {
Integer i = (Integer) SQL_TYPES.get(new Byte(dataType));
if (i != null) {
return i.intValue();
} else {
throw new SQLException("Unsupported data type: " + dataType);
}
}
public static byte fromSQLType(int sqlType) throws SQLException {
Byte b = (Byte) SQL_TYPES.getKey(new Integer(sqlType));
if (b != null) {
return b.byteValue();
} else {
throw new SQLException("Unsupported SQL type: " + sqlType);
}
}
}

+ 717
- 0
src/java/com/healthmarketscience/jackcess/Database.java View File

@@ -0,0 +1,717 @@
/*
Copyright (c) 2005 Health Market Science, Inc.

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA

You can contact Health Market Science at info@healthmarketscience.com
or at the following address:

Health Market Science
2700 Horizon Drive
Suite 200
King of Prussia, PA 19406
*/

package com.healthmarketscience.jackcess;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Types;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Date;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import org.apache.commons.lang.builder.ToStringBuilder;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
* An Access database.
*
* @author Tim McCune
*/
public class Database {
private static final Log LOG = LogFactory.getLog(Database.class);
private static final byte[] SID = new byte[2];
static {
SID[0] = (byte) 0xA6;
SID[1] = (byte) 0x33;
}
/** Batch commit size for copying other result sets into this database */
private static final int COPY_TABLE_BATCH_SIZE = 200;
/** System catalog always lives on page 2 */
private static final int PAGE_SYSTEM_CATALOG = 2;
private static final Integer ACM = new Integer(1048319);
/** Free space left in page for new usage map definition pages */
private static final short USAGE_MAP_DEF_FREE_SPACE = 3940;
private static final String COL_ACM = "ACM";
/** System catalog column name of the date a system object was created */
private static final String COL_DATE_CREATE = "DateCreate";
/** System catalog column name of the date a system object was updated */
private static final String COL_DATE_UPDATE = "DateUpdate";
private static final String COL_F_INHERITABLE = "FInheritable";
private static final String COL_FLAGS = "Flags";
/**
* System catalog column name of the page on which system object definitions
* are stored
*/
private static final String COL_ID = "Id";
/** System catalog column name of the name of a system object */
private static final String COL_NAME = "Name";
private static final String COL_OBJECT_ID = "ObjectId";
private static final String COL_OWNER = "Owner";
/** System catalog column name of a system object's parent's id */
private static final String COL_PARENT_ID = "ParentId";
private static final String COL_SID = "SID";
/** System catalog column name of the type of a system object */
private static final String COL_TYPE = "Type";
/** Empty database template for creating new databases */
private static final String EMPTY_MDB = "com/healthmarketscience/jackcess/empty.mdb";
/** Prefix for column or table names that are reserved words */
private static final String ESCAPE_PREFIX = "x";
/** Prefix that flags system tables */
private static final String PREFIX_SYSTEM = "MSys";
/** Name of the system object that is the parent of all tables */
private static final String SYSTEM_OBJECT_NAME_TABLES = "Tables";
/** Name of the table that contains system access control entries */
private static final String TABLE_SYSTEM_ACES = "MSysACEs";
/** System object type for table definitions */
private static final Short TYPE_TABLE = new Short((short) 1);
/**
* All of the reserved words in Access that should be escaped when creating
* table or column names (String)
*/
private static final Set RESERVED_WORDS = new HashSet();
static {
//Yup, there's a lot.
RESERVED_WORDS.addAll(Arrays.asList(new String[] {
"add", "all", "alphanumeric", "alter", "and", "any", "application", "as",
"asc", "assistant", "autoincrement", "avg", "between", "binary", "bit",
"boolean", "by", "byte", "char", "character", "column", "compactdatabase",
"constraint", "container", "count", "counter", "create", "createdatabase",
"createfield", "creategroup", "createindex", "createobject", "createproperty",
"createrelation", "createtabledef", "createuser", "createworkspace",
"currency", "currentuser", "database", "date", "datetime", "delete",
"desc", "description", "disallow", "distinct", "distinctrow", "document",
"double", "drop", "echo", "else", "end", "eqv", "error", "exists", "exit",
"false", "field", "fields", "fillcache", "float", "float4", "float8",
"foreign", "form", "forms", "from", "full", "function", "general",
"getobject", "getoption", "gotopage", "group", "group by", "guid", "having",
"idle", "ieeedouble", "ieeesingle", "if", "ignore", "imp", "in", "index",
"indexes", "inner", "insert", "inserttext", "int", "integer", "integer1",
"integer2", "integer4", "into", "is", "join", "key", "lastmodified", "left",
"level", "like", "logical", "logical1", "long", "longbinary", "longtext",
"macro", "match", "max", "min", "mod", "memo", "module", "money", "move",
"name", "newpassword", "no", "not", "null", "number", "numeric", "object",
"oleobject", "off", "on", "openrecordset", "option", "or", "order", "outer",
"owneraccess", "parameter", "parameters", "partial", "percent", "pivot",
"primary", "procedure", "property", "queries", "query", "quit", "real",
"recalc", "recordset", "references", "refresh", "refreshlink",
"registerdatabase", "relation", "repaint", "repairdatabase", "report",
"reports", "requery", "right", "screen", "section", "select", "set",
"setfocus", "setoption", "short", "single", "smallint", "some", "sql",
"stdev", "stdevp", "string", "sum", "table", "tabledef", "tabledefs",
"tableid", "text", "time", "timestamp", "top", "transform", "true", "type",
"union", "unique", "update", "user", "value", "values", "var", "varp",
"varbinary", "varchar", "where", "with", "workspace", "xor", "year", "yes",
"yesno"
}));
}
/** Buffer to hold database pages */
private ByteBuffer _buffer;
/** ID of the Tables system object */
private Integer _tableParentId;
/** Format that the containing database is in */
private JetFormat _format;
/**
* Map of table names to page numbers containing their definition
* (String -> Integer)
*/
private Map _tables = new HashMap();
/** Reads and writes database pages */
private PageChannel _pageChannel;
/** System catalog table */
private Table _systemCatalog;
/** System access control entries table */
private Table _accessControlEntries;
/**
* Open an existing Database
* @param mdbFile File containing the database
*/
public static Database open(File mdbFile) throws IOException {
return new Database(openChannel(mdbFile));
}
/**
* Create a new Database
* @param mdbFile Location to write the new database to. <b>If this file
* already exists, it will be overwritten.</b>
*/
public static Database create(File mdbFile) throws IOException {
FileChannel channel = openChannel(mdbFile);
channel.transferFrom(Channels.newChannel(
Thread.currentThread().getContextClassLoader().getResourceAsStream(
EMPTY_MDB)), 0, (long) Integer.MAX_VALUE);
return new Database(channel);
}
private static FileChannel openChannel(File mdbFile) throws FileNotFoundException {
return new RandomAccessFile(mdbFile, "rw").getChannel();
}
/**
* Create a new database by reading it in from a FileChannel.
* @param channel File channel of the database. This needs to be a
* FileChannel instead of a ReadableByteChannel because we need to
* randomly jump around to various points in the file.
*/
protected Database(FileChannel channel) throws IOException {
_format = JetFormat.getFormat(channel);
_pageChannel = new PageChannel(channel, _format);
_buffer = _pageChannel.createPageBuffer();
readSystemCatalog();
}
public PageChannel getPageChannel() {
return _pageChannel;
}
/**
* @return The system catalog table
*/
public Table getSystemCatalog() {
return _systemCatalog;
}
public Table getAccessControlEntries() {
return _accessControlEntries;
}
/**
* Read the system catalog
*/
private void readSystemCatalog() throws IOException {
_pageChannel.readPage(_buffer, PAGE_SYSTEM_CATALOG);
byte pageType = _buffer.get();
if (pageType != PageTypes.TABLE_DEF) {
throw new IOException("Looking for system catalog at page " +
PAGE_SYSTEM_CATALOG + ", but page type is " + pageType);
}
_systemCatalog = new Table(_buffer, _pageChannel, _format, PAGE_SYSTEM_CATALOG);
Map row;
while ( (row = _systemCatalog.getNextRow(Arrays.asList(
new String[] {COL_NAME, COL_TYPE, COL_ID}))) != null)
{
String name = (String) row.get(COL_NAME);
if (name != null && TYPE_TABLE.equals(row.get(COL_TYPE))) {
if (!name.startsWith(PREFIX_SYSTEM)) {
_tables.put(row.get(COL_NAME), row.get(COL_ID));
} else if (TABLE_SYSTEM_ACES.equals(name)) {
readAccessControlEntries(((Integer) row.get(COL_ID)).intValue());
}
} else if (SYSTEM_OBJECT_NAME_TABLES.equals(name)) {
_tableParentId = (Integer) row.get(COL_ID);
}
}
if (LOG.isDebugEnabled()) {
LOG.debug("Finished reading system catalog. Tables: " + _tables);
}
}
/**
* Read the system access control entries table
* @param pageNum Page number of the table def
*/
private void readAccessControlEntries(int pageNum) throws IOException {
ByteBuffer buffer = _pageChannel.createPageBuffer();
_pageChannel.readPage(buffer, pageNum);
byte pageType = buffer.get();
if (pageType != PageTypes.TABLE_DEF) {
throw new IOException("Looking for MSysACEs at page " + pageNum +
", but page type is " + pageType);
}
_accessControlEntries = new Table(buffer, _pageChannel, _format, pageNum);
}
/**
* @return The names of all of the user tables (String)
*/
public Set getTableNames() {
return _tables.keySet();
}
/**
* @param name Table name
* @return The table, or null if it doesn't exist
*/
public Table getTable(String name) throws IOException {
Integer pageNumber = (Integer) _tables.get(name);
if (pageNumber == null) {
// Bug workaround:
pageNumber = (Integer) _tables.get(Character.toUpperCase(name.charAt(0)) +
name.substring(1));
}
if (pageNumber == null) {
return null;
} else {
_pageChannel.readPage(_buffer, pageNumber.intValue());
return new Table(_buffer, _pageChannel, _format, pageNumber.intValue());
}
}
/**
* Create a new table in this database
* @param name Name of the table to create
* @param columns List of Columns in the table
*/
//XXX Set up 1-page rollback buffer?
public void createTable(String name, List columns) throws IOException {
//There is some really bizarre bug in here where tables that start with
//the letters a-m (only lower case) won't open in Access. :)
name = Character.toUpperCase(name.charAt(0)) + name.substring(1);
//We are creating a new page at the end of the db for the tdef.
int pageNumber = _pageChannel.getPageCount();
ByteBuffer buffer = _pageChannel.createPageBuffer();
writeTableDefinition(buffer, columns, pageNumber);
writeColumnDefinitions(buffer, columns);
//End of tabledef
buffer.put((byte) 0xff);
buffer.put((byte) 0xff);
buffer.putInt(8, buffer.position()); //Overwrite length of data for this page
//Write the tdef and usage map pages to disk.
_pageChannel.writeNewPage(buffer);
_pageChannel.writeNewPage(createUsageMapDefinitionBuffer(pageNumber));
_pageChannel.writeNewPage(createUsageMapDataBuffer()); //Usage map
//Add this table to our internal list.
_tables.put(name, new Integer(pageNumber));
//Add this table to system tables
addToSystemCatalog(name, pageNumber);
addToAccessControlEntries(pageNumber);
}
/**
* @param buffer Buffer to write to
* @param columns List of Columns in the table
* @param pageNumber Page number that this table definition will be written to
*/
private void writeTableDefinition(ByteBuffer buffer, List columns, int pageNumber)
throws IOException {
//Start writing the tdef
buffer.put(PageTypes.TABLE_DEF); //Page type
buffer.put((byte) 0x01); //Unknown
buffer.put((byte) 0); //Unknown
buffer.put((byte) 0); //Unknown
buffer.putInt(0); //Next TDEF page pointer
buffer.putInt(0); //Length of data for this page
buffer.put((byte) 0x59); //Unknown
buffer.put((byte) 0x06); //Unknown
buffer.putShort((short) 0); //Unknown
buffer.putInt(0); //Number of rows
buffer.putInt(0); //Autonumber
for (int i = 0; i < 16; i++) { //Unknown
buffer.put((byte) 0);
}
buffer.put(Table.TYPE_USER); //Table type
buffer.putShort((short) columns.size()); //Max columns a row will have
buffer.putShort(Column.countVariableLength(columns)); //Number of variable columns in table
buffer.putShort((short) columns.size()); //Number of columns in table
buffer.putInt(0); //Number of indexes in table
buffer.putInt(0); //Number of indexes in table
buffer.put((byte) 0); //Usage map row number
int usageMapPage = pageNumber + 1;
buffer.put(ByteUtil.to3ByteInt(usageMapPage)); //Usage map page number
buffer.put((byte) 1); //Free map row number
buffer.put(ByteUtil.to3ByteInt(usageMapPage)); //Free map page number
if (LOG.isDebugEnabled()) {
int position = buffer.position();
buffer.rewind();
LOG.debug("Creating new table def block:\n" + ByteUtil.toHexString(
buffer, _format.SIZE_TDEF_BLOCK));
buffer.position(position);
}
}
/**
* @param buffer Buffer to write to
* @param columns List of Columns to write definitions for
*/
private void writeColumnDefinitions(ByteBuffer buffer, List columns)
throws IOException {
Iterator iter;
short columnNumber = (short) 0;
short fixedOffset = (short) 0;
short variableOffset = (short) 0;
for (iter = columns.iterator(); iter.hasNext(); columnNumber++) {
Column col = (Column) iter.next();
int position = buffer.position();
buffer.put(col.getType());
buffer.put((byte) 0x59); //Unknown
buffer.put((byte) 0x06); //Unknown
buffer.putShort((short) 0); //Unknown
buffer.putShort(columnNumber); //Column Number
if (col.isVariableLength()) {
buffer.putShort(variableOffset++);
} else {
buffer.putShort((short) 0);
}
buffer.putShort(columnNumber); //Column Number again
buffer.put((byte) 0x09); //Unknown
buffer.put((byte) 0x04); //Unknown
buffer.putShort((short) 0); //Unknown
if (col.isVariableLength()) { //Variable length
buffer.put((byte) 0x2);
} else {
buffer.put((byte) 0x3);
}
if (col.isCompressedUnicode()) { //Compressed
buffer.put((byte) 1);
} else {
buffer.put((byte) 0);
}
buffer.putInt(0); //Unknown, but always 0.
//Offset for fixed length columns
if (col.isVariableLength()) {
buffer.putShort((short) 0);
} else {
buffer.putShort(fixedOffset);
fixedOffset += col.size();
}
buffer.putShort(col.getLength()); //Column length
if (LOG.isDebugEnabled()) {
LOG.debug("Creating new column def block\n" + ByteUtil.toHexString(
buffer, position, _format.SIZE_COLUMN_DEF_BLOCK));
}
}
iter = columns.iterator();
while (iter.hasNext()) {
Column col = (Column) iter.next();
ByteBuffer colName = _format.CHARSET.encode(col.getName());
buffer.putShort((short) colName.remaining());
buffer.put(colName);
}
}
/**
* Create the usage map definition page buffer. It will be stored on the page
* immediately after the tdef page.
* @param pageNumber Page number that the corresponding table definition will
* be written to
*/
private ByteBuffer createUsageMapDefinitionBuffer(int pageNumber) throws IOException {
ByteBuffer rtn = _pageChannel.createPageBuffer();
rtn.put(PageTypes.DATA);
rtn.put((byte) 0x1); //Unknown
rtn.putShort(USAGE_MAP_DEF_FREE_SPACE); //Free space in page
rtn.putInt(0); //Table definition
rtn.putInt(0); //Unknown
rtn.putShort((short) 2); //Number of records on this page
rtn.putShort((short) _format.OFFSET_USED_PAGES_USAGE_MAP_DEF); //First location
rtn.putShort((short) _format.OFFSET_FREE_PAGES_USAGE_MAP_DEF); //Second location
rtn.position(_format.OFFSET_USED_PAGES_USAGE_MAP_DEF);
rtn.put((byte) UsageMap.MAP_TYPE_REFERENCE);
rtn.putInt(pageNumber + 2); //First referenced page number
rtn.position(_format.OFFSET_FREE_PAGES_USAGE_MAP_DEF);
rtn.put((byte) UsageMap.MAP_TYPE_INLINE);
return rtn;
}
/**
* Create a usage map data page buffer.
*/
private ByteBuffer createUsageMapDataBuffer() throws IOException {
ByteBuffer rtn = _pageChannel.createPageBuffer();
rtn.put(PageTypes.USAGE_MAP);
rtn.put((byte) 0x01); //Unknown
rtn.putShort((short) 0); //Unknown
return rtn;
}
/**
* Add a new table to the system catalog
* @param name Table name
* @param pageNumber Page number that contains the table definition
*/
private void addToSystemCatalog(String name, int pageNumber) throws IOException {
Object[] catalogRow = new Object[_systemCatalog.getColumns().size()];
int idx = 0;
Iterator iter;
for (iter = _systemCatalog.getColumns().iterator(); iter.hasNext(); idx++) {
Column col = (Column) iter.next();
if (COL_ID.equals(col.getName())) {
catalogRow[idx] = new Integer(pageNumber);
} else if (COL_NAME.equals(col.getName())) {
catalogRow[idx] = name;
} else if (COL_TYPE.equals(col.getName())) {
catalogRow[idx] = TYPE_TABLE;
} else if (COL_DATE_CREATE.equals(col.getName()) ||
COL_DATE_UPDATE.equals(col.getName()))
{
catalogRow[idx] = new Date();
} else if (COL_PARENT_ID.equals(col.getName())) {
catalogRow[idx] = _tableParentId;
} else if (COL_FLAGS.equals(col.getName())) {
catalogRow[idx] = new Integer(0);
} else if (COL_OWNER.equals(col.getName())) {
byte[] owner = new byte[2];
catalogRow[idx] = owner;
owner[0] = (byte) 0xcf;
owner[1] = (byte) 0x5f;
}
}
_systemCatalog.addRow(catalogRow);
}
/**
* Add a new table to the system's access control entries
* @param pageNumber Page number that contains the table definition
*/
private void addToAccessControlEntries(int pageNumber) throws IOException {
Object[] aceRow = new Object[_accessControlEntries.getColumns().size()];
int idx = 0;
Iterator iter;
for (iter = _accessControlEntries.getColumns().iterator(); iter.hasNext(); idx++) {
Column col = (Column) iter.next();
if (col.getName().equals(COL_ACM)) {
aceRow[idx] = ACM;
} else if (col.getName().equals(COL_F_INHERITABLE)) {
aceRow[idx] = Boolean.FALSE;
} else if (col.getName().equals(COL_OBJECT_ID)) {
aceRow[idx] = new Integer(pageNumber);
} else if (col.getName().equals(COL_SID)) {
aceRow[idx] = SID;
}
}
_accessControlEntries.addRow(aceRow);
}
/**
* Copy an existing JDBC ResultSet into a new table in this database
* @param name Name of the new table to create
* @param source ResultSet to copy from
*/
public void copyTable(String name, ResultSet source) throws SQLException, IOException {
ResultSetMetaData md = source.getMetaData();
List columns = new LinkedList();
int textCount = 0;
int totalSize = 0;
for (int i = 1; i <= md.getColumnCount(); i++) {
switch (md.getColumnType(i)) {
case Types.INTEGER:
case Types.FLOAT:
totalSize += 4;
break;
case Types.DOUBLE:
case Types.DATE:
totalSize += 8;
break;
case Types.VARCHAR:
textCount++;
break;
}
}
short textSize = 0;
if (textCount > 0) {
textSize = (short) ((_format.MAX_RECORD_SIZE - totalSize) / textCount);
if (textSize > _format.TEXT_FIELD_MAX_LENGTH) {
textSize = _format.TEXT_FIELD_MAX_LENGTH;
}
}
for (int i = 1; i <= md.getColumnCount(); i++) {
Column column = new Column();
column.setName(escape(md.getColumnName(i)));
column.setType(DataTypes.fromSQLType(md.getColumnType(i)));
if (column.getType() == DataTypes.TEXT) {
column.setLength(textSize);
}
columns.add(column);
}
createTable(escape(name), columns);
Table table = getTable(escape(name));
List rows = new ArrayList();
while (source.next()) {
Object[] row = new Object[md.getColumnCount()];
for (int i = 0; i < row.length; i++) {
row[i] = source.getObject(i + 1);
}
rows.add(row);
if (rows.size() == COPY_TABLE_BATCH_SIZE) {
table.addRows(rows);
rows.clear();
}
}
if (rows.size() > 0) {
table.addRows(rows);
}
}
/**
* Copy a delimited text file into a new table in this database
* @param name Name of the new table to create
* @param f Source file to import
* @param delim Regular expression representing the delimiter string.
*/
public void importFile(String name, File f,
String delim)
throws IOException
{
BufferedReader in = null;
try
{
in = new BufferedReader(new FileReader(f));
importReader(name, in, delim);
}
finally
{
if (in != null)
{
try
{
in.close();
}
catch (IOException ex)
{
LOG.warn("Could not close file " + f.getAbsolutePath(), ex);
}
}
}
}


/**
* Copy a delimited text file into a new table in this database
* @param name Name of the new table to create
* @param in Source reader to import
* @param delim Regular expression representing the delimiter string.
*/
public void importReader(String name, BufferedReader in,
String delim)
throws IOException
{
String line = in.readLine();
if (line == null || line.trim().length() == 0)
{
return;
}

String tableName = escape(name);
int counter = 0;
while(getTable(tableName) != null)
{
tableName = escape(name + (counter++));
}

List columns = new LinkedList();
String[] columnNames = line.split(delim);
short textSize = (short) ((_format.MAX_RECORD_SIZE) / columnNames.length);
if (textSize > _format.TEXT_FIELD_MAX_LENGTH) {
textSize = _format.TEXT_FIELD_MAX_LENGTH;
}

for (int i = 0; i < columnNames.length; i++) {
Column column = new Column();
column.setName(escape(columnNames[i]));
column.setType(DataTypes.TEXT);
column.setLength(textSize);
columns.add(column);
}
createTable(tableName, columns);
Table table = getTable(tableName);
List rows = new ArrayList();
while ((line = in.readLine()) != null)
{
//
// Handle the situation where the end of the line
// may have null fields. We always want to add the
// same number of columns to the table each time.
//
String[] data = new String[columnNames.length];
String[] splitData = line.split(delim);
System.arraycopy(splitData, 0, data, 0, splitData.length);
rows.add(data);
if (rows.size() == COPY_TABLE_BATCH_SIZE) {
table.addRows(rows);
rows.clear();
}
}
if (rows.size() > 0) {
table.addRows(rows);
}
}
/**
* Close the database file
*/
public void close() throws IOException {
_pageChannel.close();
}
/**
* @return A table or column name escaped for Access
*/
private String escape(String s) {
if (RESERVED_WORDS.contains(s.toLowerCase())) {
return ESCAPE_PREFIX + s;
} else {
return s;
}
}
public String toString() {
return ToStringBuilder.reflectionToString(this);
}
}

+ 506
- 0
src/java/com/healthmarketscience/jackcess/Index.java View File

@@ -0,0 +1,506 @@
/*
Copyright (c) 2005 Health Market Science, Inc.

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA

You can contact Health Market Science at info@healthmarketscience.com
or at the following address:

Health Market Science
2700 Horizon Drive
Suite 200
King of Prussia, PA 19406
*/

package com.healthmarketscience.jackcess;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;
import org.apache.commons.collections.bidimap.DualHashBidiMap;
import org.apache.commons.collections.BidiMap;
import org.apache.commons.lang.builder.CompareToBuilder;

/**
* Access table index
* @author Tim McCune
*/
public class Index implements Comparable {
/** Max number of columns in an index */
private static final int MAX_COLUMNS = 10;
private static final short COLUMN_UNUSED = -1;
/**
* Map of characters to bytes that Access uses in indexes (not ASCII)
* (Character -> Byte)
*/
private static BidiMap CODES = new DualHashBidiMap();
static {
//These values are prefixed with a '43'
CODES.put(new Character('^'), new Byte((byte) 2));
CODES.put(new Character('_'), new Byte((byte) 3));
CODES.put(new Character('{'), new Byte((byte) 9));
CODES.put(new Character('|'), new Byte((byte) 11));
CODES.put(new Character('}'), new Byte((byte) 13));
CODES.put(new Character('~'), new Byte((byte) 15));
//These values aren't.
CODES.put(new Character(' '), new Byte((byte) 7));
CODES.put(new Character('#'), new Byte((byte) 12));
CODES.put(new Character('$'), new Byte((byte) 14));
CODES.put(new Character('%'), new Byte((byte) 16));
CODES.put(new Character('&'), new Byte((byte) 18));
CODES.put(new Character('('), new Byte((byte) 20));
CODES.put(new Character(')'), new Byte((byte) 22));
CODES.put(new Character('*'), new Byte((byte) 24));
CODES.put(new Character(','), new Byte((byte) 26));
CODES.put(new Character('/'), new Byte((byte) 30));
CODES.put(new Character(':'), new Byte((byte) 32));
CODES.put(new Character(';'), new Byte((byte) 34));
CODES.put(new Character('?'), new Byte((byte) 36));
CODES.put(new Character('@'), new Byte((byte) 38));
CODES.put(new Character('+'), new Byte((byte) 44));
CODES.put(new Character('<'), new Byte((byte) 46));
CODES.put(new Character('='), new Byte((byte) 48));
CODES.put(new Character('>'), new Byte((byte) 50));
CODES.put(new Character('0'), new Byte((byte) 54));
CODES.put(new Character('1'), new Byte((byte) 56));
CODES.put(new Character('2'), new Byte((byte) 58));
CODES.put(new Character('3'), new Byte((byte) 60));
CODES.put(new Character('4'), new Byte((byte) 62));
CODES.put(new Character('5'), new Byte((byte) 64));
CODES.put(new Character('6'), new Byte((byte) 66));
CODES.put(new Character('7'), new Byte((byte) 68));
CODES.put(new Character('8'), new Byte((byte) 70));
CODES.put(new Character('9'), new Byte((byte) 72));
CODES.put(new Character('A'), new Byte((byte) 74));
CODES.put(new Character('B'), new Byte((byte) 76));
CODES.put(new Character('C'), new Byte((byte) 77));
CODES.put(new Character('D'), new Byte((byte) 79));
CODES.put(new Character('E'), new Byte((byte) 81));
CODES.put(new Character('F'), new Byte((byte) 83));
CODES.put(new Character('G'), new Byte((byte) 85));
CODES.put(new Character('H'), new Byte((byte) 87));
CODES.put(new Character('I'), new Byte((byte) 89));
CODES.put(new Character('J'), new Byte((byte) 91));
CODES.put(new Character('K'), new Byte((byte) 92));
CODES.put(new Character('L'), new Byte((byte) 94));
CODES.put(new Character('M'), new Byte((byte) 96));
CODES.put(new Character('N'), new Byte((byte) 98));
CODES.put(new Character('O'), new Byte((byte) 100));
CODES.put(new Character('P'), new Byte((byte) 102));
CODES.put(new Character('Q'), new Byte((byte) 104));
CODES.put(new Character('R'), new Byte((byte) 105));
CODES.put(new Character('S'), new Byte((byte) 107));
CODES.put(new Character('T'), new Byte((byte) 109));
CODES.put(new Character('U'), new Byte((byte) 111));
CODES.put(new Character('V'), new Byte((byte) 113));
CODES.put(new Character('W'), new Byte((byte) 115));
CODES.put(new Character('X'), new Byte((byte) 117));
CODES.put(new Character('Y'), new Byte((byte) 118));
CODES.put(new Character('Z'), new Byte((byte) 120));
}
/** Page number of the index data */
private int _pageNumber;
private int _parentPageNumber;
/** Number of rows in the index */
private int _rowCount;
private JetFormat _format;
private List _allColumns;
private SortedSet _entries = new TreeSet();
/** Map of columns to order (Column -> Byte) */
private Map _columns = new LinkedHashMap();
private PageChannel _pageChannel;
/** 0-based index number */
private int _indexNumber;
/** Index name */
private String _name;
public Index(int parentPageNumber, PageChannel channel, JetFormat format) {
_parentPageNumber = parentPageNumber;
_pageChannel = channel;
_format = format;
}
public void setIndexNumber(int indexNumber) {
_indexNumber = indexNumber;
}
public int getIndexNumber() {
return _indexNumber;
}
public void setRowCount(int rowCount) {
_rowCount = rowCount;
}
public void setName(String name) {
_name = name;
}
public void update() throws IOException {
_pageChannel.writePage(write(), _pageNumber);
}
/**
* Write this index out to a buffer
*/
public ByteBuffer write() throws IOException {
ByteBuffer buffer = _pageChannel.createPageBuffer();
buffer.put((byte) 0x04); //Page type
buffer.put((byte) 0x01); //Unknown
buffer.putShort((short) 0); //Free space
buffer.putInt(_parentPageNumber);
buffer.putInt(0); //Prev page
buffer.putInt(0); //Next page
buffer.putInt(0); //Leaf page
buffer.putInt(0); //Unknown
buffer.put((byte) 0); //Unknown
buffer.put((byte) 0); //Unknown
buffer.put((byte) 0); //Unknown
byte[] entryMask = new byte[_format.SIZE_INDEX_ENTRY_MASK];
int totalSize = 0;
Iterator iter = _entries.iterator();
while (iter.hasNext()) {
Entry entry = (Entry) iter.next();
int size = entry.size();
totalSize += size;
int idx = totalSize / 8;
entryMask[idx] |= (1 << (totalSize % 8));
}
buffer.put(entryMask);
iter = _entries.iterator();
while (iter.hasNext()) {
Entry entry = (Entry) iter.next();
entry.write(buffer);
}
buffer.putShort(2, (short) (_format.PAGE_SIZE - buffer.position()));
return buffer;
}
/**
* Read this index in from a buffer
* @param buffer Buffer to read from
* @param availableColumns Columns that this index may use
*/
public void read(ByteBuffer buffer, List availableColumns)
throws IOException
{
_allColumns = availableColumns;
for (int i = 0; i < MAX_COLUMNS; i++) {
short columnNumber = buffer.getShort();
Byte order = new Byte(buffer.get());
if (columnNumber != COLUMN_UNUSED) {
_columns.put(availableColumns.get(columnNumber), order);
}
}
buffer.getInt(); //Forward past Unknown
_pageNumber = buffer.getInt();
buffer.position(buffer.position() + 10); //Forward past other stuff
ByteBuffer indexPage = _pageChannel.createPageBuffer();
_pageChannel.readPage(indexPage, _pageNumber);
indexPage.position(_format.OFFSET_INDEX_ENTRY_MASK);
byte[] entryMask = new byte[_format.SIZE_INDEX_ENTRY_MASK];
indexPage.get(entryMask);
int lastStart = 0;
for (int i = 0; i < entryMask.length; i++) {
for (int j = 0; j < 8; j++) {
if ((entryMask[i] & (1 << j)) != 0) {
int length = i * 8 + j - lastStart;
_entries.add(new Entry(indexPage));
lastStart += length;
}
}
}
}
/**
* Add a row to this index
* @param row Row to add
* @param pageNumber Page number on which the row is stored
* @param rowNumber Row number at which the row is stored
*/
public void addRow(Object[] row, int pageNumber, byte rowNumber) {
_entries.add(new Entry(row, pageNumber, rowNumber));
}
public String toString() {
StringBuffer rtn = new StringBuffer();
rtn.append("\tName: " + _name);
rtn.append("\n\tNumber: " + _indexNumber);
rtn.append("\n\tPage number: " + _pageNumber);
rtn.append("\n\tColumns: " + _columns);
rtn.append("\n\tEntries: " + _entries);
rtn.append("\n\n");
return rtn.toString();
}
public int compareTo(Object obj) {
Index other = (Index) obj;
if (_indexNumber > other.getIndexNumber()) {
return 1;
} else if (_indexNumber < other.getIndexNumber()) {
return -1;
} else {
return 0;
}
}
/**
* A single entry in an index (points to a single row)
*/
private class Entry implements Comparable {
/** Page number on which the row is stored */
private int _page;
/** Row number at which the row is stored */
private byte _row;
/** Columns that are indexed */
private List _entryColumns = new ArrayList();
/**
* Create a new entry
* @param values Indexed row values
* @param page Page number on which the row is stored
* @param rowNumber Row number at which the row is stored
*/
public Entry(Object[] values, int page, byte rowNumber) {
_page = page;
_row = rowNumber;
Iterator iter = _columns.keySet().iterator();
while (iter.hasNext()) {
Column col = (Column) iter.next();
Object value = values[col.getColumnNumber()];
_entryColumns.add(new EntryColumn(col, (Comparable) value));
}
}
/**
* Read an existing entry in from a buffer
*/
public Entry(ByteBuffer buffer) throws IOException {
Iterator iter = _columns.keySet().iterator();
while (iter.hasNext()) {
_entryColumns.add(new EntryColumn((Column) iter.next(), buffer));
}
//3-byte int in big endian order! Gotta love those kooky MS programmers. :)
_page = (((int) buffer.get()) & 0xFF) << 16;
_page += (((int) buffer.get()) & 0xFF) << 8;
_page += (int) buffer.get();
_row = buffer.get();
}
public List getEntryColumns() {
return _entryColumns;
}
public int getPage() {
return _page;
}
public byte getRow() {
return _row;
}
public int size() {
int rtn = 5;
Iterator iter = _entryColumns.iterator();
while (iter.hasNext()) {
rtn += ((EntryColumn) iter.next()).size();
}
return rtn;
}
/**
* Write this entry into a buffer
*/
public void write(ByteBuffer buffer) throws IOException {
Iterator iter = _entryColumns.iterator();
while (iter.hasNext()) {
((EntryColumn) iter.next()).write(buffer);
}
buffer.put((byte) (_page >>> 16));
buffer.put((byte) (_page >>> 8));
buffer.put((byte) _page);
buffer.put(_row);
}
public String toString() {
return ("Page = " + _page + ", Row = " + _row + ", Columns = " + _entryColumns + "\n");
}
public int compareTo(Object obj) {
if (this == obj) {
return 0;
}
Entry other = (Entry) obj;
Iterator myIter = _entryColumns.iterator();
Iterator otherIter = other.getEntryColumns().iterator();
while (myIter.hasNext()) {
if (!otherIter.hasNext()) {
throw new IllegalArgumentException(
"Trying to compare index entries with a different number of entry columns");
}
EntryColumn myCol = (EntryColumn) myIter.next();
EntryColumn otherCol = (EntryColumn) otherIter.next();
int i = myCol.compareTo(otherCol);
if (i != 0) {
return i;
}
}
return new CompareToBuilder().append(_page, other.getPage())
.append(_row, other.getRow()).toComparison();
}
}
/**
* A single column value within an index Entry; encapsulates column
* definition and column value.
*/
private class EntryColumn implements Comparable {
/** Column definition */
private Column _column;
/** Column value */
private Comparable _value;
/**
* Create a new EntryColumn
*/
public EntryColumn(Column col, Comparable value) {
_column = col;
_value = value;
}
/**
* Read in an existing EntryColumn from a buffer
*/
public EntryColumn(Column col, ByteBuffer buffer) throws IOException {
_column = col;
byte flag = buffer.get();
if (flag != (byte) 0) {
if (col.getType() == DataTypes.TEXT) {
StringBuffer sb = new StringBuffer();
byte b;
while ( (b = buffer.get()) != (byte) 1) {
if ((int) b == 43) {
b = buffer.get();
}
Character c = (Character) CODES.getKey(new Byte(b));
if (c != null) {
sb.append(c.charValue());
}
}
buffer.get(); //Forward past 0x00
_value = sb.toString();
} else {
byte[] data = new byte[col.size()];
buffer.get(data);
_value = (Comparable) col.read(data, ByteOrder.BIG_ENDIAN);
//ints and shorts are stored in index as value + 2147483648
if (_value instanceof Integer) {
_value = new Integer((int) (((Integer) _value).longValue() + (long) Integer.MAX_VALUE + 1L));
} else if (_value instanceof Short) {
_value = new Short((short) (((Short) _value).longValue() + (long) Integer.MAX_VALUE + 1L));
}
}
}
}
public Comparable getValue() {
return _value;
}
/**
* Write this entry column to a buffer
*/
public void write(ByteBuffer buffer) throws IOException {
buffer.put((byte) 0x7F);
if (_column.getType() == DataTypes.TEXT) {
String s = (String) _value;
for (int i = 0; i < s.length(); i++) {
Byte b = (Byte) CODES.get(new Character(Character.toUpperCase(s.charAt(i))));
if (b == null) {
throw new IOException("Unmapped index value: " + s.charAt(i));
} else {
byte bv = b.byteValue();
//WTF is this? No idea why it's this way, but it is. :)
if (bv == (byte) 2 || bv == (byte) 3 || bv == (byte) 9 || bv == (byte) 11 ||
bv == (byte) 13 || bv == (byte) 15)
{
buffer.put((byte) 43); //Ah, the magic 43.
}
buffer.put(b.byteValue());
if (s.equals("_")) {
buffer.put((byte) 3);
}
}
}
buffer.put((byte) 1);
buffer.put((byte) 0);
} else {
Comparable value = _value;
if (value instanceof Integer) {
value = new Integer((int) (((Integer) value).longValue() - ((long) Integer.MAX_VALUE + 1L)));
} else if (value instanceof Short) {
value = new Short((short) (((Short) value).longValue() - ((long) Integer.MAX_VALUE + 1L)));
}
buffer.put(_column.write(value, ByteOrder.BIG_ENDIAN));
}
}
public int size() {
if (_value == null) {
return 0;
} else if (_value instanceof String) {
int rtn = 3;
String s = (String) _value;
for (int i = 0; i < s.length(); i++) {
rtn++;
if (s.charAt(i) == '^' || s.charAt(i) == '_' || s.charAt(i) == '{' ||
s.charAt(i) == '|' || s.charAt(i) == '}' || s.charAt(i) == '-')
{
rtn++;
}
}
return rtn;
} else {
return _column.size();
}
}
public String toString() {
return String.valueOf(_value);
}
public int compareTo(Object obj) {
return new CompareToBuilder().append(_value, ((EntryColumn) obj).getValue())
.toComparison();
}
}
}

+ 98
- 0
src/java/com/healthmarketscience/jackcess/InlineUsageMap.java View File

@@ -0,0 +1,98 @@
/*
Copyright (c) 2005 Health Market Science, Inc.

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA

You can contact Health Market Science at info@healthmarketscience.com
or at the following address:

Health Market Science
2700 Horizon Drive
Suite 200
King of Prussia, PA 19406
*/

package com.healthmarketscience.jackcess;

import java.io.IOException;
import java.nio.ByteBuffer;

/**
* Usage map whose map is written inline in the same page. This type of map
* can contain a maximum of 512 pages, and is always used for free space maps.
* It has a start page, which all page numbers in its map are calculated as
* starting from.
* @author Tim McCune
*/
public class InlineUsageMap extends UsageMap {
/** Size in bytes of the map */
private static final int MAP_SIZE = 64;
/** First page that this usage map applies to */
private int _startPage = 0;
/**
* @param pageChannel Used to read in pages
* @param dataBuffer Buffer that contains this map's declaration
* @param pageNum Page number that this usage map is contained in
* @param format Format of the database that contains this usage map
* @param rowStart Offset at which the declaration starts in the buffer
*/
public InlineUsageMap(PageChannel pageChannel, ByteBuffer dataBuffer,
int pageNum, JetFormat format, short rowStart)
throws IOException
{
super(pageChannel, dataBuffer, pageNum, format, rowStart);
_startPage = dataBuffer.getInt(rowStart + 1);
processMap(dataBuffer, 0, _startPage);
}
//Javadoc copied from UsageMap
protected void addOrRemovePageNumber(final int pageNumber, boolean add)
throws IOException
{
if (add && pageNumber < _startPage) {
throw new IOException("Can't add page number " + pageNumber +
" because it is less than start page " + _startPage);
}
int relativePageNumber = pageNumber - _startPage;
ByteBuffer buffer = getDataBuffer();
if ((!add && !getPageNumbers().remove(new Integer(pageNumber))) || (add &&
(relativePageNumber > MAP_SIZE * 8 - 1)))
{
//Increase the start page to the current page and clear out the map.
_startPage = pageNumber;
buffer.position(getRowStart() + 1);
buffer.putInt(_startPage);
getPageNumbers().clear();
if (!add) {
for (int j = 0; j < MAP_SIZE; j++) {
buffer.put((byte) 0xff); //Fill bitmap with 1s
}
for (int j = _startPage; j < _startPage + MAP_SIZE * 8; j++) {
getPageNumbers().add(new Integer(j)); //Fill our list with page numbers
}
}
getPageChannel().writePage(buffer, getDataPageNumber());
relativePageNumber = pageNumber - _startPage;
}
updateMap(pageNumber, relativePageNumber, 1 << (relativePageNumber % 8), buffer, add);
//Write the updated map back to disk
getPageChannel().writePage(buffer, getDataPageNumber());
}
}

+ 302
- 0
src/java/com/healthmarketscience/jackcess/JetFormat.java View File

@@ -0,0 +1,302 @@
/*
Copyright (c) 2005 Health Market Science, Inc.

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA

You can contact Health Market Science at info@healthmarketscience.com
or at the following address:

Health Market Science
2700 Horizon Drive
Suite 200
King of Prussia, PA 19406
*/

package com.healthmarketscience.jackcess;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;

/**
* Encapsulates constants describing a specific version of the Access Jet format
* @author Tim McCune
*/
public abstract class JetFormat {
/** Maximum size of a record minus OLE objects and Memo fields */
public static final int MAX_RECORD_SIZE = 1900; //2kb minus some overhead
/** Maximum size of a text field */
public static final short TEXT_FIELD_MAX_LENGTH = 255 * 2;
/** Offset in the file that holds the byte describing the Jet format version */
private static final long OFFSET_VERSION = 20L;
/** Version code for Jet version 3 */
private static final byte CODE_VERSION_3 = 0x0;
/** Version code for Jet version 4 */
private static final byte CODE_VERSION_4 = 0x1;

//These constants are populated by this class's constructor. They can't be
//populated by the subclass's constructor because they are final, and Java
//doesn't allow this; hence all the abstract defineXXX() methods.
/** Database page size in bytes */
public final int PAGE_SIZE;
public final int MAX_ROW_SIZE;
public final int OFFSET_NEXT_TABLE_DEF_PAGE;
public final int OFFSET_NUM_ROWS;
public final int OFFSET_TABLE_TYPE;
public final int OFFSET_NUM_COLS;
public final int OFFSET_NUM_INDEXES;
public final int OFFSET_OWNED_PAGES;
public final int OFFSET_FREE_SPACE_PAGES;
public final int OFFSET_INDEX_DEF_BLOCK;
public final int OFFSET_COLUMN_TYPE;
public final int OFFSET_COLUMN_NUMBER;
public final int OFFSET_COLUMN_PRECISION;
public final int OFFSET_COLUMN_SCALE;
public final int OFFSET_COLUMN_VARIABLE;
public final int OFFSET_COLUMN_COMPRESSED_UNICODE;
public final int OFFSET_COLUMN_LENGTH;
public final int OFFSET_TABLE_DEF_LOCATION;
public final int OFFSET_NUM_ROWS_ON_PAGE;
public final int OFFSET_ROW_LOCATION_BLOCK;
public final int OFFSET_ROW_START;
public final int OFFSET_MAP_START;
public final int OFFSET_USAGE_MAP_PAGE_DATA;
public final int OFFSET_REFERENCE_MAP_PAGE_NUMBERS;
public final int OFFSET_FREE_SPACE;
public final int OFFSET_DATA_ROW_LOCATION_BLOCK;
public final int OFFSET_NUM_ROWS_ON_DATA_PAGE;
public final int OFFSET_LVAL_ROW_LOCATION_BLOCK;
public final int OFFSET_USED_PAGES_USAGE_MAP_DEF;
public final int OFFSET_FREE_PAGES_USAGE_MAP_DEF;
public final int OFFSET_INDEX_ENTRY_MASK;
public final int SIZE_INDEX_DEFINITION;
public final int SIZE_COLUMN_HEADER;
public final int SIZE_ROW_LOCATION;
public final int SIZE_LONG_VALUE_DEF;
public final int SIZE_TDEF_BLOCK;
public final int SIZE_COLUMN_DEF_BLOCK;
public final int SIZE_INDEX_ENTRY_MASK;
public final int PAGES_PER_USAGE_MAP_PAGE;
public final Charset CHARSET;
public static final JetFormat VERSION_4 = new Jet4Format();
/**
* @return The Jet Format represented in the passed-in file
*/
public static JetFormat getFormat(FileChannel channel) throws IOException {
ByteBuffer buffer = ByteBuffer.allocate(1);
channel.read(buffer, OFFSET_VERSION);
buffer.flip();
byte version = buffer.get();
if (version == CODE_VERSION_4) {
return VERSION_4;
} else {
throw new IOException("Unsupported version: " + version);
}
}
private JetFormat() {
PAGE_SIZE = definePageSize();
MAX_ROW_SIZE = defineMaxRowSize();
OFFSET_NEXT_TABLE_DEF_PAGE = defineOffsetNextTableDefPage();
OFFSET_NUM_ROWS = defineOffsetNumRows();
OFFSET_TABLE_TYPE = defineOffsetTableType();
OFFSET_NUM_COLS = defineOffsetNumCols();
OFFSET_NUM_INDEXES = defineOffsetNumIndexes();
OFFSET_OWNED_PAGES = defineOffsetOwnedPages();
OFFSET_FREE_SPACE_PAGES = defineOffsetFreeSpacePages();
OFFSET_INDEX_DEF_BLOCK = defineOffsetIndexDefBlock();
OFFSET_COLUMN_TYPE = defineOffsetColumnType();
OFFSET_COLUMN_NUMBER = defineOffsetColumnNumber();
OFFSET_COLUMN_PRECISION = defineOffsetColumnPrecision();
OFFSET_COLUMN_SCALE = defineOffsetColumnScale();
OFFSET_COLUMN_VARIABLE = defineOffsetColumnVariable();
OFFSET_COLUMN_COMPRESSED_UNICODE = defineOffsetColumnCompressedUnicode();
OFFSET_COLUMN_LENGTH = defineOffsetColumnLength();
OFFSET_TABLE_DEF_LOCATION = defineOffsetTableDefLocation();
OFFSET_NUM_ROWS_ON_PAGE = defineOffsetNumRowsOnPage();
OFFSET_ROW_LOCATION_BLOCK = defineOffsetRowLocationBlock();
OFFSET_ROW_START = defineOffsetRowStart();
OFFSET_MAP_START = defineOffsetMapStart();
OFFSET_USAGE_MAP_PAGE_DATA = defineOffsetUsageMapPageData();
OFFSET_REFERENCE_MAP_PAGE_NUMBERS = defineOffsetReferenceMapPageNumbers();
OFFSET_FREE_SPACE = defineOffsetFreeSpace();
OFFSET_DATA_ROW_LOCATION_BLOCK = defineOffsetDataRowLocationBlock();
OFFSET_NUM_ROWS_ON_DATA_PAGE = defineOffsetNumRowsOnDataPage();
OFFSET_LVAL_ROW_LOCATION_BLOCK = defineOffsetLvalRowLocationBlock();
OFFSET_USED_PAGES_USAGE_MAP_DEF = defineOffsetUsedPagesUsageMapDef();
OFFSET_FREE_PAGES_USAGE_MAP_DEF = defineOffsetFreePagesUsageMapDef();
OFFSET_INDEX_ENTRY_MASK = defineOffsetIndexEntryMask();
SIZE_INDEX_DEFINITION = defineSizeIndexDefinition();
SIZE_COLUMN_HEADER = defineSizeColumnHeader();
SIZE_ROW_LOCATION = defineSizeRowLocation();
SIZE_LONG_VALUE_DEF = defineSizeLongValueDef();
SIZE_TDEF_BLOCK = defineSizeTdefBlock();
SIZE_COLUMN_DEF_BLOCK = defineSizeColumnDefBlock();
SIZE_INDEX_ENTRY_MASK = defineSizeIndexEntryMask();
PAGES_PER_USAGE_MAP_PAGE = definePagesPerUsageMapPage();
CHARSET = defineCharset();
}
protected abstract int definePageSize();
protected abstract int defineMaxRowSize();
protected abstract int defineOffsetNextTableDefPage();
protected abstract int defineOffsetNumRows();
protected abstract int defineOffsetTableType();
protected abstract int defineOffsetNumCols();
protected abstract int defineOffsetNumIndexes();
protected abstract int defineOffsetOwnedPages();
protected abstract int defineOffsetFreeSpacePages();
protected abstract int defineOffsetIndexDefBlock();
protected abstract int defineOffsetColumnType();
protected abstract int defineOffsetColumnNumber();
protected abstract int defineOffsetColumnPrecision();
protected abstract int defineOffsetColumnScale();
protected abstract int defineOffsetColumnVariable();
protected abstract int defineOffsetColumnCompressedUnicode();
protected abstract int defineOffsetColumnLength();

protected abstract int defineOffsetTableDefLocation();
protected abstract int defineOffsetNumRowsOnPage();
protected abstract int defineOffsetRowLocationBlock();
protected abstract int defineOffsetRowStart();
protected abstract int defineOffsetMapStart();
protected abstract int defineOffsetUsageMapPageData();
protected abstract int defineOffsetReferenceMapPageNumbers();
protected abstract int defineOffsetFreeSpace();
protected abstract int defineOffsetDataRowLocationBlock();
protected abstract int defineOffsetNumRowsOnDataPage();
protected abstract int defineOffsetLvalRowLocationBlock();
protected abstract int defineOffsetUsedPagesUsageMapDef();
protected abstract int defineOffsetFreePagesUsageMapDef();
protected abstract int defineOffsetIndexEntryMask();
protected abstract int defineSizeIndexDefinition();
protected abstract int defineSizeColumnHeader();
protected abstract int defineSizeRowLocation();
protected abstract int defineSizeLongValueDef();
protected abstract int defineSizeTdefBlock();
protected abstract int defineSizeColumnDefBlock();
protected abstract int defineSizeIndexEntryMask();
protected abstract int definePagesPerUsageMapPage();
protected abstract Charset defineCharset();
private static final class Jet4Format extends JetFormat {
protected int definePageSize() { return 4096; }
protected int defineMaxRowSize() { return PAGE_SIZE - 18; }
protected int defineOffsetNextTableDefPage() { return 4; }
protected int defineOffsetNumRows() { return 16; }
protected int defineOffsetTableType() { return 40; }
protected int defineOffsetNumCols() { return 45; }
protected int defineOffsetNumIndexes() { return 47; }
protected int defineOffsetOwnedPages() { return 55; }
protected int defineOffsetFreeSpacePages() { return 59; }
protected int defineOffsetIndexDefBlock() { return 63; }
protected int defineOffsetColumnType() { return 0; }
protected int defineOffsetColumnNumber() { return 5; }
protected int defineOffsetColumnPrecision() { return 11; }
protected int defineOffsetColumnScale() { return 12; }
protected int defineOffsetColumnVariable() { return 15; }
protected int defineOffsetColumnCompressedUnicode() { return 16; }
protected int defineOffsetColumnLength() { return 23; }
protected int defineOffsetTableDefLocation() { return 4; }
protected int defineOffsetNumRowsOnPage() { return 12; }
protected int defineOffsetRowLocationBlock() { return 16; }
protected int defineOffsetRowStart() { return 14; }
protected int defineOffsetMapStart() { return 5; }
protected int defineOffsetUsageMapPageData() { return 4; }
protected int defineOffsetReferenceMapPageNumbers() { return 1; }
protected int defineOffsetFreeSpace() { return 2; }
protected int defineOffsetDataRowLocationBlock() { return 14; }
protected int defineOffsetNumRowsOnDataPage() { return 12; }
protected int defineOffsetLvalRowLocationBlock() { return 10; }
protected int defineOffsetUsedPagesUsageMapDef() { return 4027; }
protected int defineOffsetFreePagesUsageMapDef() { return 3958; }
protected int defineOffsetIndexEntryMask() { return 27; }
protected int defineSizeIndexDefinition() { return 12; }
protected int defineSizeColumnHeader() { return 25; }
protected int defineSizeRowLocation() { return 2; }
protected int defineSizeLongValueDef() { return 12; }
protected int defineSizeTdefBlock() { return 63; }
protected int defineSizeColumnDefBlock() { return 25; }
protected int defineSizeIndexEntryMask() { return 453; }
protected int definePagesPerUsageMapPage() { return 4092 * 8; }
protected Charset defineCharset() { return Charset.forName("UTF-16LE"); }
}
}

+ 88
- 0
src/java/com/healthmarketscience/jackcess/NullMask.java View File

@@ -0,0 +1,88 @@
/*
Copyright (c) 2005 Health Market Science, Inc.

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA

You can contact Health Market Science at info@healthmarketscience.com
or at the following address:

Health Market Science
2700 Horizon Drive
Suite 200
King of Prussia, PA 19406
*/

package com.healthmarketscience.jackcess;

import java.nio.ByteBuffer;

/**
* Bitmask that indicates whether or not each column in a row is null. Also
* holds values of boolean columns.
* @author Tim McCune
*/
public class NullMask {
/** The actual bitmask */
private byte[] _mask;
/**
* @param columnCount Number of columns in the row that this mask will be
* used for
*/
public NullMask(int columnCount) {
_mask = new byte[(columnCount + 7) / 8];
for (int i = 0; i < _mask.length; i++) {
_mask[i] = (byte) 0xff;
}
for (int i = columnCount; i < _mask.length * 8; i++) {
markNull(i);
}
}
/**
* Read a mask in from a buffer
*/
public void read(ByteBuffer buffer) {
buffer.get(_mask);
}
public ByteBuffer wrap() {
return ByteBuffer.wrap(_mask);
}
/**
* @param columnNumber 0-based column number in this mask's row
* @return Whether or not the value for that column is null. For boolean
* columns, returns the actual value of the column.
*/
public boolean isNull(int columnNumber) {
return (_mask[columnNumber / 8] & (byte) (1 << (columnNumber % 8))) == 0;
}
public void markNull(int columnNumber) {
int maskIndex = columnNumber / 8;
_mask[maskIndex] = (byte) (_mask[maskIndex] & (byte) ~(1 << (columnNumber % 8)));
}
/**
* @return Size in bytes of this mask
*/
public int byteSize() {
return _mask.length;
}
}

+ 135
- 0
src/java/com/healthmarketscience/jackcess/PageChannel.java View File

@@ -0,0 +1,135 @@
/*
Copyright (c) 2005 Health Market Science, Inc.

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA

You can contact Health Market Science at info@healthmarketscience.com
or at the following address:

Health Market Science
2700 Horizon Drive
Suite 200
King of Prussia, PA 19406
*/

package com.healthmarketscience.jackcess;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.Channel;
import java.nio.channels.FileChannel;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
* Reads and writes individual pages in a database file
* @author Tim McCune
*/
public class PageChannel implements Channel {
private static final Log LOG = LogFactory.getLog(PageChannel.class);
/** Global usage map always lives on page 1 */
private static final int PAGE_GLOBAL_USAGE_MAP = 1;
/** Channel containing the database */
private FileChannel _channel;
/** Format of the database in the channel */
private JetFormat _format;
/** Tracks free pages in the database. */
private UsageMap _globalUsageMap;
/**
* @param channel Channel containing the database
* @param format Format of the database in the channel
*/
public PageChannel(FileChannel channel, JetFormat format) throws IOException {
_channel = channel;
_format = format;
//Null check only exists for unit tests. Channel should never normally be null.
if (channel != null) {
_globalUsageMap = UsageMap.read(this, PAGE_GLOBAL_USAGE_MAP, (byte) 0, format);
}
}
/**
* @param buffer Buffer to read the page into
* @param pageNumber Number of the page to read in (starting at 0)
* @return True if the page was successfully read into the buffer, false if
* that page doesn't exist.
*/
public boolean readPage(ByteBuffer buffer, int pageNumber) throws IOException {
if (LOG.isDebugEnabled()) {
LOG.debug("Reading in page " + Integer.toHexString(pageNumber));
}
buffer.clear();
boolean rtn = _channel.read(buffer, (long) pageNumber * (long) _format.PAGE_SIZE) != -1;
buffer.flip();
return rtn;
}
/**
* Write a page to disk
* @param page Page to write
* @param pageNumber Page number to write the page to
*/
public void writePage(ByteBuffer page, int pageNumber) throws IOException {
page.rewind();
_channel.write(page, (long) pageNumber * (long) _format.PAGE_SIZE);
_channel.force(true);
}
/**
* Write a page to disk as a new page, appending it to the database
* @param page Page to write
* @return Page number at which the page was written
*/
public int writeNewPage(ByteBuffer page) throws IOException {
long size = _channel.size();
page.rewind();
_channel.write(page, size);
int pageNumber = (int) (size / _format.PAGE_SIZE);
_globalUsageMap.removePageNumber(pageNumber); //force is done here
return pageNumber;
}
/**
* @return Number of pages in the database
*/
public int getPageCount() throws IOException {
return (int) (_channel.size() / _format.PAGE_SIZE);
}
/**
* @return A newly-allocated buffer that can be passed to readPage
*/
public ByteBuffer createPageBuffer() {
ByteBuffer rtn = ByteBuffer.allocate(_format.PAGE_SIZE);
rtn.order(ByteOrder.LITTLE_ENDIAN);
return rtn;
}
public void close() throws IOException {
_channel.force(true);
_channel.close();
}
public boolean isOpen() {
return _channel.isOpen();
}
}

+ 43
- 0
src/java/com/healthmarketscience/jackcess/PageTypes.java View File

@@ -0,0 +1,43 @@
/*
Copyright (c) 2005 Health Market Science, Inc.

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA

You can contact Health Market Science at info@healthmarketscience.com
or at the following address:

Health Market Science
2700 Horizon Drive
Suite 200
King of Prussia, PA 19406
*/

package com.healthmarketscience.jackcess;

/**
* Codes for page types
* @author Tim McCune
*/
public interface PageTypes {
/** Data page */
public static final byte DATA = 0x1;
/** Table definition page */
public static final byte TABLE_DEF = 0x2;
/** Table usage map page */
public static final byte USAGE_MAP = 0x5;
}

+ 118
- 0
src/java/com/healthmarketscience/jackcess/ReferenceUsageMap.java View File

@@ -0,0 +1,118 @@
/*
Copyright (c) 2005 Health Market Science, Inc.

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA

You can contact Health Market Science at info@healthmarketscience.com
or at the following address:

Health Market Science
2700 Horizon Drive
Suite 200
King of Prussia, PA 19406
*/

package com.healthmarketscience.jackcess;

import java.io.IOException;
import java.nio.ByteBuffer;

/**
* Usage map whose map is written across one or more entire separate pages of
* page type USAGE_MAP. This type of map can contain 32736 pages per reference
* page, and a maximum of 16 reference map pages for a total maximum of 523776
* pages (2 GB).
* @author Tim McCune
*/
public class ReferenceUsageMap extends UsageMap {
/** Buffer that contains the current reference map page */
private ByteBuffer _mapPageBuffer;
/** Page number of the reference map page that was last read */
private int _mapPageNum;
/**
* @param pageChannel Used to read in pages
* @param dataBuffer Buffer that contains this map's declaration
* @param pageNum Page number that this usage map is contained in
* @param format Format of the database that contains this usage map
* @param rowStart Offset at which the declaration starts in the buffer
*/
public ReferenceUsageMap(PageChannel pageChannel, ByteBuffer dataBuffer,
int pageNum, JetFormat format, short rowStart)
throws IOException
{
super(pageChannel, dataBuffer, pageNum, format, rowStart);
_mapPageBuffer = pageChannel.createPageBuffer();
for (int i = 0; i < 17; i++) {
_mapPageNum = dataBuffer.getInt(getRowStart() +
format.OFFSET_REFERENCE_MAP_PAGE_NUMBERS + (4 * i));
if (_mapPageNum > 0) {
pageChannel.readPage(_mapPageBuffer, _mapPageNum);
byte pageType = _mapPageBuffer.get();
if (pageType != PageTypes.USAGE_MAP) {
throw new IOException("Looking for usage map at page " + _mapPageNum +
", but page type is " + pageType);
}
_mapPageBuffer.position(format.OFFSET_USAGE_MAP_PAGE_DATA);
setStartOffset(_mapPageBuffer.position());
processMap(_mapPageBuffer, i, 0);
}
}
}
//Javadoc copied from UsageMap
protected void addOrRemovePageNumber(final int pageNumber, boolean add)
throws IOException
{
int pageIndex = (int) Math.floor(pageNumber / getFormat().PAGES_PER_USAGE_MAP_PAGE);
int mapPageNumber = getDataBuffer().getInt(calculateMapPagePointerOffset(pageIndex));
if (mapPageNumber > 0) {
if (_mapPageNum != mapPageNumber) {
//Need to read in the map page
getPageChannel().readPage(_mapPageBuffer, mapPageNumber);
_mapPageNum = mapPageNumber;
}
} else {
//Need to create a new usage map page
createNewUsageMapPage(pageIndex);
}
updateMap(pageNumber, pageNumber - (getFormat().PAGES_PER_USAGE_MAP_PAGE * pageIndex),
1 << ((pageNumber - (getFormat().PAGES_PER_USAGE_MAP_PAGE * pageIndex)) % 8),
_mapPageBuffer, add);
getPageChannel().writePage(_mapPageBuffer, _mapPageNum);
}
/**
* Create a new usage map page and update the map declaration with a pointer
* to it.
* @param pageIndex Index of the page reference within the map declaration
*/
private void createNewUsageMapPage(int pageIndex) throws IOException {
_mapPageBuffer = getPageChannel().createPageBuffer();
_mapPageBuffer.put(PageTypes.USAGE_MAP);
_mapPageBuffer.put((byte) 0x01); //Unknown
_mapPageBuffer.putShort((short) 0); //Unknown
_mapPageNum = getPageChannel().writeNewPage(_mapPageBuffer);
getDataBuffer().putInt(calculateMapPagePointerOffset(pageIndex), _mapPageNum);
getPageChannel().writePage(getDataBuffer(), getDataPageNumber());
}
private int calculateMapPagePointerOffset(int pageIndex) {
return getRowStart() + getFormat().OFFSET_REFERENCE_MAP_PAGE_NUMBERS + (pageIndex * 4);
}
}

+ 559
- 0
src/java/com/healthmarketscience/jackcess/Table.java View File

@@ -0,0 +1,559 @@
/*
Copyright (c) 2005 Health Market Science, Inc.

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA

You can contact Health Market Science at info@healthmarketscience.com
or at the following address:

Health Market Science
2700 Horizon Drive
Suite 200
King of Prussia, PA 19406
*/

package com.healthmarketscience.jackcess;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
* A single database table
* @author Tim McCune
*/
public class Table {
private static final Log LOG = LogFactory.getLog(Table.class);
/** Table type code for system tables */
public static final byte TYPE_SYSTEM = 0x53;
/** Table type code for user tables */
public static final byte TYPE_USER = 0x4e;
/** Buffer used for reading the table */
private ByteBuffer _buffer;
/** Type of the table (either TYPE_SYSTEM or TYPE_USER) */
private byte _tableType;
/** Number of the current row in a data page */
private int _currentRowInPage;
/** Number of indexes on the table */
private int _indexCount;
/** Offset index in the buffer where the last row read started */
private short _lastRowStart;
/** Number of rows in the table */
private int _rowCount;
private int _tableDefPageNumber;
/** Number of rows left to be read on the current page */
private short _rowsLeftOnPage = 0;
/** Offset index in the buffer of the start of the current row */
private short _rowStart;
/** Number of columns in the table */
private short _columnCount;
/** Format of the database that contains this table */
private JetFormat _format;
/** List of columns in this table (Column) */
private List _columns = new ArrayList();
/** List of indexes on this table (Index) */
private List _indexes = new ArrayList();
/** Used to read in pages */
private PageChannel _pageChannel;
/** Usage map of pages that this table owns */
private UsageMap _ownedPages;
/** Usage map of pages that this table owns with free space on them */
private UsageMap _freeSpacePages;
/**
* Only used by unit tests
*/
Table() throws IOException {
_pageChannel = new PageChannel(null, JetFormat.VERSION_4);
}
/**
* @param buffer Buffer to read the table with
* @param pageChannel Page channel to get database pages from
* @param format Format of the database that contains this table
* @param pageNumber Page number of the table definition
*/
protected Table(ByteBuffer buffer, PageChannel pageChannel, JetFormat format, int pageNumber)
throws IOException
{
_buffer = buffer;
_pageChannel = pageChannel;
_format = format;
_tableDefPageNumber = pageNumber;
int nextPage;
do {
readPage();
nextPage = _buffer.getInt(_format.OFFSET_NEXT_TABLE_DEF_PAGE);
} while (nextPage > 0);
}
/**
* @return All of the columns in this table (unmodifiable List)
*/
public List getColumns() {
return Collections.unmodifiableList(_columns);
}
/**
* Only called by unit tests
*/
void setColumns(List columns) {
_columns = columns;
}
/**
* @return All of the Indexes on this table (unmodifiable List)
*/
public List getIndexes() {
return Collections.unmodifiableList(_indexes);
}
/**
* After calling this method, getNextRow will return the first row in the table
*/
public void reset() {
_rowsLeftOnPage = 0;
_ownedPages.reset();
}
/**
* @return The next row in this table (Column name (String) -> Column value (Object))
*/
public Map getNextRow() throws IOException {
return getNextRow(null);
}
/**
* @param columnNames Only column names in this collection will be returned
* @return The next row in this table (Column name (String) -> Column value (Object))
*/
public Map getNextRow(Collection columnNames) throws IOException {
if (!positionAtNextRow()) {
return null;
}
if (LOG.isDebugEnabled()) {
LOG.debug("Data block at position " + Integer.toHexString(_buffer.position()) +
":\n" + ByteUtil.toHexString(_buffer, _buffer.position(),
_buffer.limit() - _buffer.position()));
}
short columnCount = _buffer.getShort(); //Number of columns in this table
Map rtn = new LinkedHashMap(columnCount);
NullMask nullMask = new NullMask(columnCount);
_buffer.position(_buffer.limit() - nullMask.byteSize()); //Null mask at end
nullMask.read(_buffer);
_buffer.position(_buffer.limit() - nullMask.byteSize() - 2);
short varColumnCount = _buffer.getShort(); //Number of variable length columns
byte[][] varColumnData = new byte[varColumnCount][]; //Holds variable length column data
//Read in the offsets of each of the variable length columns
short[] varColumnOffsets = new short[varColumnCount];
_buffer.position(_buffer.position() - 2 - (varColumnCount * 2) - 2);
short lastVarColumnStart = _buffer.getShort();
for (short i = 0; i < varColumnCount; i++) {
varColumnOffsets[i] = _buffer.getShort();
}
//Read in the actual data for each of the variable length columns
for (short i = 0; i < varColumnCount; i++) {
_buffer.position(_rowStart + varColumnOffsets[i]);
varColumnData[i] = new byte[lastVarColumnStart - varColumnOffsets[i]];
_buffer.get(varColumnData[i]);
lastVarColumnStart = varColumnOffsets[i];
}
int columnNumber = 0;
int varColumnDataIndex = varColumnCount - 1;
_buffer.position(_rowStart + 2); //Move back to the front of the buffer
//Now read in the fixed length columns and populate the columnData array
//with the combination of fixed length and variable length data.
byte[] columnData;
for (Iterator iter = _columns.iterator(); iter.hasNext(); columnNumber++) {
Column column = (Column) iter.next();
boolean isNull = nullMask.isNull(columnNumber);
Object value = null;
if (column.getType() == DataTypes.BOOLEAN) {
value = new Boolean(!isNull); //Boolean values are stored in the null mask
} else if (!isNull) {
if (!column.isVariableLength()) {
//Read in fixed length column data
columnData = new byte[column.size()];
_buffer.get(columnData);
} else {
//Refer to already-read-in variable length data
columnData = varColumnData[varColumnDataIndex--];
}
if (columnNames == null || columnNames.contains(column.getName())) {
//Add the value if we are interested in it.
value = column.read(columnData);
}
}
rtn.put(column.getName(), value);
}
return rtn;
}
/**
* Position the buffer at the next row in the table
* @return True if another row was found, false if there are no more rows
*/
private boolean positionAtNextRow() throws IOException {
if (_rowsLeftOnPage == 0) {
do {
if (!_ownedPages.getNextPage(_buffer)) {
//No more owned pages. No more rows.
return false;
}
} while (_buffer.get() != PageTypes.DATA); //Only interested in data pages
_rowsLeftOnPage = _buffer.getShort(_format.OFFSET_NUM_ROWS_ON_DATA_PAGE);
_currentRowInPage = 0;
_lastRowStart = (short) _format.PAGE_SIZE;
}
_rowStart = _buffer.getShort(_format.OFFSET_DATA_ROW_LOCATION_BLOCK +
_currentRowInPage * _format.SIZE_ROW_LOCATION);
// XXX - Handle overflow pages and deleted rows.
_buffer.position(_rowStart);
_buffer.limit(_lastRowStart);
_rowsLeftOnPage--;
_currentRowInPage++;
_lastRowStart = _rowStart;
return true;
}

/**
* Read the table definition
*/
private void readPage() throws IOException {
if (LOG.isDebugEnabled()) {
_buffer.rewind();
LOG.debug("Table def block:\n" + ByteUtil.toHexString(_buffer,
_format.SIZE_TDEF_BLOCK));
}
_rowCount = _buffer.getInt(_format.OFFSET_NUM_ROWS);
_tableType = _buffer.get(_format.OFFSET_TABLE_TYPE);
_columnCount = _buffer.getShort(_format.OFFSET_NUM_COLS);
_indexCount = _buffer.getInt(_format.OFFSET_NUM_INDEXES);
byte rowNum = _buffer.get(_format.OFFSET_OWNED_PAGES);
int pageNum = ByteUtil.get3ByteInt(_buffer, _format.OFFSET_OWNED_PAGES + 1);
_ownedPages = UsageMap.read(_pageChannel, pageNum, rowNum, _format);
rowNum = _buffer.get(_format.OFFSET_FREE_SPACE_PAGES);
pageNum = ByteUtil.get3ByteInt(_buffer, _format.OFFSET_FREE_SPACE_PAGES + 1);
_freeSpacePages = UsageMap.read(_pageChannel, pageNum, rowNum, _format);
for (int i = 0; i < _indexCount; i++) {
Index index = new Index(_tableDefPageNumber, _pageChannel, _format);
_indexes.add(index);
index.setRowCount(_buffer.getInt(_format.OFFSET_INDEX_DEF_BLOCK +
i * _format.SIZE_INDEX_DEFINITION + 4));
}
int offset = _format.OFFSET_INDEX_DEF_BLOCK +
_indexCount * _format.SIZE_INDEX_DEFINITION;
Column column;
for (int i = 0; i < _columnCount; i++) {
column = new Column(_buffer,
offset + i * _format.SIZE_COLUMN_HEADER, _pageChannel, _format);
_columns.add(column);
}
offset += _columnCount * _format.SIZE_COLUMN_HEADER;
for (int i = 0; i < _columnCount; i++) {
column = (Column) _columns.get(i);
short nameLength = _buffer.getShort(offset);
offset += 2;
byte[] nameBytes = new byte[nameLength];
_buffer.position(offset);
_buffer.get(nameBytes, 0, (int) nameLength);
column.setName(_format.CHARSET.decode(ByteBuffer.wrap(nameBytes)).toString());
offset += nameLength;
}
Collections.sort(_columns);
for (int i = 0; i < _indexCount; i++) {
_buffer.getInt(); //Forward past Unknown
((Index) _indexes.get(i)).read(_buffer, _columns);
}
for (int i = 0; i < _indexCount; i++) {
_buffer.getInt(); //Forward past Unknown
((Index) _indexes.get(i)).setIndexNumber(_buffer.getInt());
_buffer.position(_buffer.position() + 20);
}
Collections.sort(_indexes);
for (int i = 0; i < _indexCount; i++) {
byte[] nameBytes = new byte[_buffer.getShort()];
_buffer.get(nameBytes);
((Index) _indexes.get(i)).setName(_format.CHARSET.decode(ByteBuffer.wrap(
nameBytes)).toString());
}
}
/**
* Add a single row to this table and write it to disk
*/
public void addRow(Object[] row) throws IOException {
List rows = new ArrayList(1);
rows.add(row);
addRows(rows);
}
/**
* Add multiple rows to this table, only writing to disk after all
* rows have been written, and every time a data page is filled. This
* is much more efficient than calling <code>addRow</code> multiple times.
* @param rows List of Object[] row values
*/
public void addRows(List rows) throws IOException {
ByteBuffer dataPage = _pageChannel.createPageBuffer();
ByteBuffer[] rowData = new ByteBuffer[rows.size()];
Iterator iter = rows.iterator();
for (int i = 0; iter.hasNext(); i++) {
rowData[i] = createRow((Object[]) iter.next());
}
List pageNumbers = _ownedPages.getPageNumbers();
int pageNumber;
int rowSize;
if (pageNumbers.size() == 0) {
//No data pages exist. Create a new one.
pageNumber = newDataPage(dataPage, rowData[0]);
} else {
//Get the last data page.
//Not bothering to check other pages for free space.
pageNumber = ((Integer) pageNumbers.get(pageNumbers.size() - 1)).intValue();
_pageChannel.readPage(dataPage, pageNumber);
}
for (int i = 0; i < rowData.length; i++) {
rowSize = rowData[i].limit();
short freeSpaceInPage = dataPage.getShort(_format.OFFSET_FREE_SPACE);
if (freeSpaceInPage < (rowSize + _format.SIZE_ROW_LOCATION)) {
//Last data page is full. Create a new one.
if (rowSize + _format.SIZE_ROW_LOCATION > _format.MAX_ROW_SIZE) {
throw new IOException("Row size " + rowSize + " is too large");
}
_pageChannel.writePage(dataPage, pageNumber);
dataPage.clear();
pageNumber = newDataPage(dataPage, rowData[i]);
_freeSpacePages.removePageNumber(pageNumber);
freeSpaceInPage = dataPage.getShort(_format.OFFSET_FREE_SPACE);
}
//Decrease free space record.
dataPage.putShort(_format.OFFSET_FREE_SPACE, (short) (freeSpaceInPage -
rowSize - _format.SIZE_ROW_LOCATION));
//Increment row count record.
short rowCount = dataPage.getShort(_format.OFFSET_NUM_ROWS_ON_DATA_PAGE);
dataPage.putShort(_format.OFFSET_NUM_ROWS_ON_DATA_PAGE, (short) (rowCount + 1));
short rowLocation = (short) _format.PAGE_SIZE;
if (rowCount > 0) {
rowLocation = dataPage.getShort(_format.OFFSET_DATA_ROW_LOCATION_BLOCK +
(rowCount - 1) * _format.SIZE_ROW_LOCATION);
}
rowLocation -= rowSize;
dataPage.putShort(_format.OFFSET_DATA_ROW_LOCATION_BLOCK +
rowCount * _format.SIZE_ROW_LOCATION, rowLocation);
dataPage.position(rowLocation);
dataPage.put(rowData[i]);
iter = _indexes.iterator();
while (iter.hasNext()) {
Index index = (Index) iter.next();
index.addRow((Object[]) rows.get(i), pageNumber, (byte) rowCount);
}
}
_pageChannel.writePage(dataPage, pageNumber);
//Update tdef page
ByteBuffer tdefPage = _pageChannel.createPageBuffer();
_pageChannel.readPage(tdefPage, _tableDefPageNumber);
tdefPage.putInt(_format.OFFSET_NUM_ROWS, ++_rowCount);
iter = _indexes.iterator();
for (int i = 0; i < _indexes.size(); i++) {
tdefPage.putInt(_format.OFFSET_INDEX_DEF_BLOCK +
i * _format.SIZE_INDEX_DEFINITION + 4, _rowCount);
Index index = (Index) iter.next();
index.update();
}
_pageChannel.writePage(tdefPage, _tableDefPageNumber);
}
/**
* Create a new data page
* @return Page number of the new page
*/
private int newDataPage(ByteBuffer dataPage, ByteBuffer rowData) throws IOException {
if (LOG.isDebugEnabled()) {
LOG.debug("Creating new data page");
}
dataPage.put(PageTypes.DATA); //Page type
dataPage.put((byte) 1); //Unknown
dataPage.putShort((short) (_format.PAGE_SIZE - _format.OFFSET_DATA_ROW_LOCATION_BLOCK -
(rowData.limit() - 1) - _format.SIZE_ROW_LOCATION)); //Free space in this page
dataPage.putInt(_tableDefPageNumber); //Page pointer to table definition
dataPage.putInt(0); //Unknown
dataPage.putInt(0); //Number of records on this page
int pageNumber = _pageChannel.writeNewPage(dataPage);
_ownedPages.addPageNumber(pageNumber);
_freeSpacePages.addPageNumber(pageNumber);
return pageNumber;
}
/**
* Serialize a row of Objects into a byte buffer
*/
ByteBuffer createRow(Object[] rowArray) throws IOException {
ByteBuffer buffer = _pageChannel.createPageBuffer();
buffer.putShort((short) _columns.size());
NullMask nullMask = new NullMask(_columns.size());
Iterator iter;
int index = 0;
Column col;
List row = new ArrayList(Arrays.asList(rowArray));
//Append null for arrays that are too small
for (int i = rowArray.length; i < _columnCount; i++) {
row.add(null);
}
for (iter = _columns.iterator(); iter.hasNext() && index < row.size(); index++) {
col = (Column) iter.next();
if (!col.isVariableLength()) {
//Fixed length column data comes first
if (row.get(index) != null) {
buffer.put(col.write(row.get(index)));
}
}
if (col.getType() == DataTypes.BOOLEAN) {
if (row.get(index) != null) {
if (!((Boolean) row.get(index)).booleanValue()) {
//Booleans are stored in the null mask
nullMask.markNull(index);
}
}
} else if (row.get(index) == null) {
nullMask.markNull(index);
}
}
int varLengthCount = Column.countVariableLength(_columns);
short[] varColumnOffsets = new short[varLengthCount];
index = 0;
int varColumnOffsetsIndex = 0;
//Now write out variable length column data
for (iter = _columns.iterator(); iter.hasNext() && index < row.size(); index++) {
col = (Column) iter.next();
short offset = (short) buffer.position();
if (col.isVariableLength()) {
if (row.get(index) != null) {
buffer.put(col.write(row.get(index)));
}
varColumnOffsets[varColumnOffsetsIndex++] = offset;
}
}
buffer.putShort((short) buffer.position()); //EOD marker
//Now write out variable length offsets
//Offsets are stored in reverse order
for (int i = varColumnOffsets.length - 1; i >= 0; i--) {
buffer.putShort(varColumnOffsets[i]);
}
buffer.putShort((short) varLengthCount); //Number of var length columns
buffer.put(nullMask.wrap()); //Null mask
buffer.limit(buffer.position());
buffer.flip();
if (LOG.isDebugEnabled()) {
LOG.debug("Creating new data block:\n" + ByteUtil.toHexString(buffer, buffer.limit()));
}
return buffer;
}
public String toString() {
StringBuffer rtn = new StringBuffer();
rtn.append("Type: " + _tableType);
rtn.append("\nRow count: " + _rowCount);
rtn.append("\nColumn count: " + _columnCount);
rtn.append("\nIndex count: " + _indexCount);
rtn.append("\nColumns:\n");
Iterator iter = _columns.iterator();
while (iter.hasNext()) {
rtn.append(iter.next().toString());
}
rtn.append("\nIndexes:\n");
iter = _indexes.iterator();
while (iter.hasNext()) {
rtn.append(iter.next().toString());
}
rtn.append("\nOwned pages: " + _ownedPages + "\n");
return rtn.toString();
}
/**
* @return A simple String representation of the entire table in tab-delimited format
*/
public String display() throws IOException {
return display(Long.MAX_VALUE);
}
/**
* @param limit Maximum number of rows to display
* @return A simple String representation of the entire table in tab-delimited format
*/
public String display(long limit) throws IOException {
reset();
StringBuffer rtn = new StringBuffer();
Iterator iter = _columns.iterator();
while (iter.hasNext()) {
Column col = (Column) iter.next();
rtn.append(col.getName());
if (iter.hasNext()) {
rtn.append("\t");
}
}
rtn.append("\n");
Map row;
int rowCount = 0;
while ((rowCount++ < limit) && (row = getNextRow()) != null) {
iter = row.values().iterator();
while (iter.hasNext()) {
Object obj = iter.next();
if (obj instanceof byte[]) {
byte[] b = (byte[]) obj;
rtn.append(ByteUtil.toHexString(ByteBuffer.wrap(b), b.length));
//This block can be used to easily dump a binary column to a file
/*java.io.File f = java.io.File.createTempFile("ole", ".bin");
java.io.FileOutputStream out = new java.io.FileOutputStream(f);
out.write(b);
out.flush();
out.close();*/
} else {
rtn.append(String.valueOf(obj));
}
if (iter.hasNext()) {
rtn.append("\t");
}
}
rtn.append("\n");
}
return rtn.toString();
}
}

+ 239
- 0
src/java/com/healthmarketscience/jackcess/UsageMap.java View File

@@ -0,0 +1,239 @@
/*
Copyright (c) 2005 Health Market Science, Inc.

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA

You can contact Health Market Science at info@healthmarketscience.com
or at the following address:

Health Market Science
2700 Horizon Drive
Suite 200
King of Prussia, PA 19406
*/

package com.healthmarketscience.jackcess;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
* Describes which database pages a particular table uses
* @author Tim McCune
*/
public abstract class UsageMap {
private static final Log LOG = LogFactory.getLog(UsageMap.class);
/** Inline map type */
public static final byte MAP_TYPE_INLINE = 0x0;
/** Reference map type, for maps that are too large to fit inline */
public static final byte MAP_TYPE_REFERENCE = 0x1;
/** Index of the current page, incremented after calling getNextPage */
private int _currentPageIndex = 0;
/** Page number of the map declaration */
private int _dataPageNum;
/** Offset of the data page at which the usage map data starts */
private int _startOffset;
/** Offset of the data page at which the usage map declaration starts */
private short _rowStart;
/** Format of the database that contains this usage map */
private JetFormat _format;
/** List of page numbers used (Integer) */
private List _pageNumbers = new ArrayList();
/** Buffer that contains the usage map declaration page */
private ByteBuffer _dataBuffer;
/** Used to read in pages */
private PageChannel _pageChannel;
/**
* @param pageChannel Used to read in pages
* @param pageNum Page number that this usage map is contained in
* @param rowNum Number of the row on the page that contains this usage map
* @param format Format of the database that contains this usage map
* @return Either an InlineUsageMap or a ReferenceUsageMap, depending on which
* type of map is found
*/
public static UsageMap read(PageChannel pageChannel, int pageNum, byte rowNum, JetFormat format)
throws IOException
{
ByteBuffer dataBuffer = pageChannel.createPageBuffer();
pageChannel.readPage(dataBuffer, pageNum);
short rowStart = dataBuffer.getShort(format.OFFSET_ROW_START + 2 * rowNum);
int rowEnd;
if (rowNum == 0) {
rowEnd = format.PAGE_SIZE - 1;
} else {
rowEnd = (dataBuffer.getShort(format.OFFSET_ROW_START + (rowNum - 1) * 2) & 0x0FFF) - 1;
}
dataBuffer.limit(rowEnd + 1);
byte mapType = dataBuffer.get(rowStart);
UsageMap rtn;
if (mapType == MAP_TYPE_INLINE) {
rtn = new InlineUsageMap(pageChannel, dataBuffer, pageNum, format, rowStart);
} else if (mapType == MAP_TYPE_REFERENCE) {
rtn = new ReferenceUsageMap(pageChannel, dataBuffer, pageNum, format, rowStart);
} else {
throw new IOException("Unrecognized map type: " + mapType);
}
return rtn;
}
/**
* @param pageChannel Used to read in pages
* @param dataBuffer Buffer that contains this map's declaration
* @param pageNum Page number that this usage map is contained in
* @param format Format of the database that contains this usage map
* @param rowStart Offset at which the declaration starts in the buffer
*/
public UsageMap(PageChannel pageChannel, ByteBuffer dataBuffer, int pageNum,
JetFormat format, short rowStart)
throws IOException
{
_pageChannel = pageChannel;
_dataBuffer = dataBuffer;
_dataPageNum = pageNum;
_format = format;
_rowStart = rowStart;
_dataBuffer.position((int) _rowStart + format.OFFSET_MAP_START);
_startOffset = _dataBuffer.position();
if (LOG.isDebugEnabled()) {
LOG.debug("Usage map block:\n" + ByteUtil.toHexString(_dataBuffer, _rowStart,
dataBuffer.limit() - _rowStart));
}
}
protected short getRowStart() {
return _rowStart;
}
public List getPageNumbers() {
return _pageNumbers;
}
protected void setStartOffset(int startOffset) {
_startOffset = startOffset;
}
protected int getStartOffset() {
return _startOffset;
}
protected ByteBuffer getDataBuffer() {
return _dataBuffer;
}
protected int getDataPageNumber() {
return _dataPageNum;
}
protected PageChannel getPageChannel() {
return _pageChannel;
}
protected JetFormat getFormat() {
return _format;
}
/**
* After calling this method, getNextPage will return the first page in the map
*/
public void reset() {
_currentPageIndex = 0;
}
/**
* @param buffer Buffer to read the next page into
* @return Whether or not there was another page to read
*/
public boolean getNextPage(ByteBuffer buffer) throws IOException {
if (_pageNumbers.size() > _currentPageIndex) {
Integer pageNumber = (Integer) _pageNumbers.get(_currentPageIndex++);
_pageChannel.readPage(buffer, pageNumber.intValue());
return true;
} else {
return false;
}
}
/**
* Read in the page numbers in this inline map
*/
protected void processMap(ByteBuffer buffer, int pageIndex, int startPage) {
int byteCount = 0;
while (buffer.hasRemaining()) {
byte b = buffer.get();
for (int i = 0; i < 8; i++) {
if ((b & (1 << i)) != 0) {
Integer pageNumber = new Integer((startPage + byteCount * 8 + i) +
(pageIndex * _format.PAGES_PER_USAGE_MAP_PAGE));
_pageNumbers.add(pageNumber);
}
}
byteCount++;
}
}
/**
* Add a page number to this usage map
*/
public void addPageNumber(int pageNumber) throws IOException {
//Sanity check, only on in debug mode for performance considerations
if (LOG.isDebugEnabled() && _pageNumbers.contains(new Integer(pageNumber))) {
throw new IOException("Page number " + pageNumber + " already in usage map");
}
addOrRemovePageNumber(pageNumber, true);
}
/**
* Remove a page number from this usage map
*/
public void removePageNumber(int pageNumber) throws IOException {
addOrRemovePageNumber(pageNumber, false);
}
protected void updateMap(int absolutePageNumber, int relativePageNumber,
int bitmask, ByteBuffer buffer, boolean add)
{
//Find the byte to apply the bitmask to
int offset = relativePageNumber / 8;
byte b = buffer.get(_startOffset + offset);
//Apply the bitmask
if (add) {
b |= bitmask;
_pageNumbers.add(new Integer(absolutePageNumber));
} else {
b &= ~bitmask;
}
buffer.put(_startOffset + offset, b);
}
public String toString() {
return "page numbers: " + _pageNumbers;
}
/**
* @param pageNumber Page number to add or remove from this map
* @param add True to add it, false to remove it
*/
protected abstract void addOrRemovePageNumber(int pageNumber, boolean add) throws IOException;
}

+ 151
- 0
src/java/com/healthmarketscience/jackcess/scsu/Debug.java View File

@@ -0,0 +1,151 @@
package com.healthmarketscience.jackcess.scsu;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/*
* This sample software accompanies Unicode Technical Report #6 and
* distributed as is by Unicode, Inc., subject to the following:
*
* Copyright © 1996-1997 Unicode, Inc.. All Rights Reserved.
*
* Permission to use, copy, modify, and distribute this software
* without fee is hereby granted provided that this copyright notice
* appears in all copies.
*
* UNICODE, INC. MAKES NO REPRESENTATIONS OR WARRANTIES ABOUT THE
* SUITABILITY OF THE SOFTWARE, EITHER EXPRESS OR IMPLIED, INCLUDING
* BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.
* UNICODE, INC., SHALL NOT BE LIABLE FOR ANY ERRORS OR OMISSIONS, AND
* SHALL NOT BE LIABLE FOR ANY DAMAGES, INCLUDING CONSEQUENTIAL AND
* INCIDENTAL DAMAGES, SUFFERED BY YOU AS A RESULT OF USING, MODIFYING
* OR DISTRIBUTING THIS SOFTWARE OR ITS DERIVATIVES.
*
* @author Asmus Freytag
*
* @version 001 Dec 25 1996
* @version 002 Jun 25 1997
* @version 003 Jul 25 1997
* @version 004 Aug 25 1997
*
* Unicode and the Unicode logo are trademarks of Unicode, Inc.,
* and are registered in some jurisdictions.
**/

/**
* A number of helpful output routines for debugging. Output can be
* centrally enabled or disabled by calling Debug.set(true/false);
* All methods are statics;
*/

public class Debug
{
private static final Log LOG = LogFactory.getLog(Debug.class);
// debugging helper
public static void out(char [] chars)
{
out(chars, 0);
}

public static void out(char [] chars, int iStart)
{
if (!LOG.isDebugEnabled()) return;
StringBuffer msg = new StringBuffer();

for (int i = iStart; i < chars.length; i++)
{
if (chars[i] >= 0 && chars[i] <= 26)
{
msg.append("^"+(char)(chars[i]+0x40));
}
else if (chars[i] <= 255)
{
msg.append(chars[i]);
}
else
{
msg.append("\\u"+Integer.toString(chars[i],16));
}
}
LOG.debug(msg.toString());
}

public static void out(byte [] bytes)
{
out(bytes, 0);
}
public static void out(byte [] bytes, int iStart)
{
if (!LOG.isDebugEnabled()) return;
StringBuffer msg = new StringBuffer();

for (int i = iStart; i < bytes.length; i++)
{
msg.append(bytes[i]+",");
}
LOG.debug(msg.toString());
}

public static void out(String str)
{
if (!LOG.isDebugEnabled()) return;

LOG.debug(str);
}

public static void out(String msg, int iData)
{
if (!LOG.isDebugEnabled()) return;

LOG.debug(msg + iData);
}
public static void out(String msg, char ch)
{
if (!LOG.isDebugEnabled()) return;

LOG.debug(msg + "[U+"+Integer.toString(ch,16)+"]" + ch);
}
public static void out(String msg, byte bData)
{
if (!LOG.isDebugEnabled()) return;

LOG.debug(msg + bData);
}
public static void out(String msg, String str)
{
if (!LOG.isDebugEnabled()) return;

LOG.debug(msg + str);
}
public static void out(String msg, char [] data)
{
if (!LOG.isDebugEnabled()) return;

LOG.debug(msg);
out(data);
}
public static void out(String msg, byte [] data)
{
if (!LOG.isDebugEnabled()) return;

LOG.debug(msg);
out(data);
}
public static void out(String msg, char [] data, int iStart)
{
if (!LOG.isDebugEnabled()) return;

LOG.debug(msg +"("+iStart+"): ");
out(data, iStart);
}
public static void out(String msg, byte [] data, int iStart)
{
if (!LOG.isDebugEnabled()) return;

LOG.debug(msg+"("+iStart+"): ");
out(data, iStart);
}
}

+ 46
- 0
src/java/com/healthmarketscience/jackcess/scsu/EndOfInputException.java View File

@@ -0,0 +1,46 @@
package com.healthmarketscience.jackcess.scsu;

/**
* This sample software accompanies Unicode Technical Report #6 and
* distributed as is by Unicode, Inc., subject to the following:
*
* Copyright © 1996-1997 Unicode, Inc.. All Rights Reserved.
*
* Permission to use, copy, modify, and distribute this software
* without fee is hereby granted provided that this copyright notice
* appears in all copies.
*
* UNICODE, INC. MAKES NO REPRESENTATIONS OR WARRANTIES ABOUT THE
* SUITABILITY OF THE SOFTWARE, EITHER EXPRESS OR IMPLIED, INCLUDING
* BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.
* UNICODE, INC., SHALL NOT BE LIABLE FOR ANY ERRORS OR OMISSIONS, AND
* SHALL NOT BE LIABLE FOR ANY DAMAGES, INCLUDING CONSEQUENTIAL AND
* INCIDENTAL DAMAGES, SUFFERED BY YOU AS A RESULT OF USING, MODIFYING
* OR DISTRIBUTING THIS SOFTWARE OR ITS DERIVATIVES.
*
* @author Asmus Freytag
*
* @version 001 Dec 25 1996
* @version 002 Jun 25 1997
* @version 003 Jul 25 1997
* @version 004 Aug 25 1997
*
* Unicode and the Unicode logo are trademarks of Unicode, Inc.,
* and are registered in some jurisdictions.
**/
/**
* The input string or input byte array ended prematurely
*
*/
public class EndOfInputException
extends java.lang.Exception
{
public EndOfInputException(){
super("The input string or input byte array ended prematurely");
}

public EndOfInputException(String s) {
super(s);
}
}

+ 429
- 0
src/java/com/healthmarketscience/jackcess/scsu/Expand.java View File

@@ -0,0 +1,429 @@
package com.healthmarketscience.jackcess.scsu;

/*
* This sample software accompanies Unicode Technical Report #6 and
* distributed as is by Unicode, Inc., subject to the following:
*
* Copyright © 1996-1998 Unicode, Inc.. All Rights Reserved.
*
* Permission to use, copy, modify, and distribute this software
* without fee is hereby granted provided that this copyright notice
* appears in all copies.
*
* UNICODE, INC. MAKES NO REPRESENTATIONS OR WARRANTIES ABOUT THE
* SUITABILITY OF THE SOFTWARE, EITHER EXPRESS OR IMPLIED, INCLUDING
* BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.
* UNICODE, INC., SHALL NOT BE LIABLE FOR ANY ERRORS OR OMISSIONS, AND
* SHALL NOT BE LIABLE FOR ANY DAMAGES, INCLUDING CONSEQUENTIAL AND
* INCIDENTAL DAMAGES, SUFFERED BY YOU AS A RESULT OF USING, MODIFYING
* OR DISTRIBUTING THIS SOFTWARE OR ITS DERIVATIVES.
*
* @author Asmus Freytag
*
* @version 001 Dec 25 1996
* @version 002 Jun 25 1997
* @version 003 Jul 25 1997
* @version 004 Aug 25 1997
* @version 005 Sep 30 1998
*
* Unicode and the Unicode logo are trademarks of Unicode, Inc.,
* and are registered in some jurisdictions.
**/

/**
Reference decoder for the Standard Compression Scheme for Unicode (SCSU)

<H2>Notes on the Java implementation</H2>

A limitation of Java is the exclusive use of a signed byte data type.
The following work arounds are required:

Copying a byte to an integer variable and adding 256 for 'negative'
bytes gives an integer in the range 0-255.

Values of char are between 0x0000 and 0xFFFF in Java. Arithmetic on
char values is unsigned.

Extended characters require an int to store them. The sign is not an
issue because only 1024*1024 + 65536 extended characters exist.

**/
public class Expand extends SCSU
{
/** (re-)define (and select) a dynamic window
A sliding window position cannot start at any Unicode value,
so rather than providing an absolute offset, this function takes
an index value which selects among the possible starting values.

Most scripts in Unicode start on or near a half-block boundary
so the default behaviour is to multiply the index by 0x80. Han,
Hangul, Surrogates and other scripts between 0x3400 and 0xDFFF
show very poor locality--therefore no sliding window can be set
there. A jumpOffset is added to the index value to skip that region,
and only 167 index values total are required to select all eligible
half-blocks.

Finally, a few scripts straddle half block boundaries. For them, a
table of fixed offsets is used, and the index values from 0xF9 to
0xFF are used to select these special offsets.

After (re-)defining a windows location it is selected so it is ready
for use.

Recall that all Windows are of the same length (128 code positions).

@param iWindow - index of the window to be (re-)defined
@param bOffset - index for the new offset value
**/
// @005 protected <-- private here and elsewhere
protected void defineWindow(int iWindow, byte bOffset)
throws IllegalInputException
{
int iOffset = (bOffset < 0 ? bOffset + 256 : bOffset);

// 0 is a reserved value
if (iOffset == 0)
{
throw new IllegalInputException();
}
else if (iOffset < gapThreshold)
{
dynamicOffset[iWindow] = iOffset << 7;
}
else if (iOffset < reservedStart)
{
dynamicOffset[iWindow] = (iOffset << 7) + gapOffset;
}
else if (iOffset < fixedThreshold)
{
// more reserved values
throw new IllegalInputException("iOffset == "+iOffset);
}
else
{
dynamicOffset[iWindow] = fixedOffset[iOffset - fixedThreshold];
}

// make the redefined window the active one
selectWindow(iWindow);
}

/** (re-)define (and select) a window as an extended dynamic window
The surrogate area in Unicode allows access to 2**20 codes beyond the
first 64K codes by combining one of 1024 characters from the High
Surrogate Area with one of 1024 characters from the Low Surrogate
Area (see Unicode 2.0 for the details).

The tags SDX and UDX set the window such that each subsequent byte in
the range 80 to FF represents a surrogate pair. The following diagram
shows how the bits in the two bytes following the SDX or UDX, and a
subsequent data byte, map onto the bits in the resulting surrogate pair.

hbyte lbyte data
nnnwwwww zzzzzyyy 1xxxxxxx

high-surrogate low-surrogate
110110wwwwwzzzzz 110111yyyxxxxxxx

@param chOffset - Since the three top bits of chOffset are not needed to
set the location of the extended Window, they are used instead
to select the window, thereby reducing the number of needed command codes.
The bottom 13 bits of chOffset are used to calculate the offset relative to
a 7 bit input data byte to yield the 20 bits expressed by each surrogate pair.
**/
protected void defineExtendedWindow(char chOffset)
{
// The top 3 bits of iOffsetHi are the window index
int iWindow = chOffset >>> 13;

// Calculate the new offset
dynamicOffset[iWindow] = ((chOffset & 0x1FFF) << 7) + (1 << 16);

// make the redefined window the active one
selectWindow(iWindow);
}

/** string buffer length used by the following functions */
protected int iOut = 0;

/** input cursor used by the following functions */
protected int iIn = 0;

/** expand input that is in Unicode mode
@param in input byte array to be expanded
@param iCur starting index
@param sb string buffer to which to append expanded input
@return the index for the lastc byte processed
**/
protected int expandUnicode(byte []in, int iCur, StringBuffer sb)
throws IllegalInputException, EndOfInputException
{
for( ; iCur < in.length-1; iCur+=2 ) // step by 2:
{
byte b = in[iCur];

if (b >= UC0 && b <= UC7)
{
Debug.out("SelectWindow: ", b);
selectWindow(b - UC0);
return iCur;
}
else if (b >= UD0 && b <= UD7)
{
defineWindow( b - UD0, in[iCur+1]);
return iCur + 1;
}
else if (b == UDX)
{
if( iCur >= in.length - 2)
{
break; // buffer error
}
defineExtendedWindow(charFromTwoBytes(in[iCur+1], in[iCur+2]));
return iCur + 2;
}
else if (b == UQU)
{
if( iCur >= in.length - 2)
{
break; // error
}
// Skip command byte and output Unicode character
iCur++;
}

// output a Unicode character
char ch = charFromTwoBytes(in[iCur], in[iCur+1]);
sb.append((char)ch);
iOut++;
}

if( iCur == in.length)
{
return iCur;
}

// Error condition
throw new EndOfInputException();
}

/** assemble a char from two bytes
In Java bytes are signed quantities, while chars are unsigned
@return the character
@param hi most significant byte
@param lo least significant byte
*/
public static char charFromTwoBytes(byte hi, byte lo)
{
char ch = (char)(lo >= 0 ? lo : 256 + lo);
return (char)(ch + (char)((hi >= 0 ? hi : 256 + hi)<<8));
}

/** expand portion of the input that is in single byte mode **/
protected String expandSingleByte(byte []in)
throws IllegalInputException, EndOfInputException
{

/* Allocate the output buffer. Because of control codes, generally
each byte of input results in fewer than one character of
output. Using in.length as an intial allocation length should avoid
the need to reallocate in mid-stream. The exception to this rule are
surrogates. */
StringBuffer sb = new StringBuffer(in.length);
iOut = 0;

// Loop until all input is exhausted or an error occurred
int iCur;
Loop:
for( iCur = 0; iCur < in.length; iCur++ )
{
// DEBUG Debug.out("Expanding: ", iCur);

// Default behaviour is that ASCII characters are passed through
// (staticOffset[0] == 0) and characters with the high bit on are
// offset by the current dynamic (or sliding) window (this.iWindow)
int iStaticWindow = 0;
int iDynamicWindow = getCurrentWindow();

switch(in[iCur])
{
// Quote from a static Window
case SQ0:
case SQ1:
case SQ2:
case SQ3:
case SQ4:
case SQ5:
case SQ6:
case SQ7:
Debug.out("SQn:", iStaticWindow);
// skip the command byte and check for length
if( iCur >= in.length - 1)
{
Debug.out("SQn missing argument: ", in, iCur);
break Loop; // buffer length error
}
// Select window pair to quote from
iDynamicWindow = iStaticWindow = in[iCur] - SQ0;
iCur ++;

// FALL THROUGH

default:
// output as character
if(in[iCur] >= 0)
{
// use static window
int ch = in[iCur] + staticOffset[iStaticWindow];
sb.append((char)ch);
iOut++;
}
else
{
// use dynamic window
int ch = (in[iCur] + 256); // adjust for signed bytes
ch -= 0x80; // reduce to range 00..7F
ch += dynamicOffset[iDynamicWindow];

//DEBUG
Debug.out("Dynamic: ", (char) ch);

if (ch < 1<<16)
{
// in Unicode range, output directly
sb.append((char)ch);
iOut++;
}
else
{
// this is an extension character
Debug.out("Extension character: ", ch);

// compute and append the two surrogates:
// translate from 10000..10FFFF to 0..FFFFF
ch -= 0x10000;

// high surrogate = top 10 bits added to D800
sb.append((char)(0xD800 + (ch>>10)));
iOut++;

// low surrogate = bottom 10 bits added to DC00
sb.append((char)(0xDC00 + (ch & ~0xFC00)));
iOut++;
}
}
break;

// define a dynamic window as extended
case SDX:
iCur += 2;
if( iCur >= in.length)
{
Debug.out("SDn missing argument: ", in, iCur -1);
break Loop; // buffer length error
}
defineExtendedWindow(charFromTwoBytes(in[iCur-1], in[iCur]));
break;

// Position a dynamic Window
case SD0:
case SD1:
case SD2:
case SD3:
case SD4:
case SD5:
case SD6:
case SD7:
iCur ++;
if( iCur >= in.length)
{
Debug.out("SDn missing argument: ", in, iCur -1);
break Loop; // buffer length error
}
defineWindow(in[iCur-1] - SD0, in[iCur]);
break;

// Select a new dynamic Window
case SC0:
case SC1:
case SC2:
case SC3:
case SC4:
case SC5:
case SC6:
case SC7:
selectWindow(in[iCur] - SC0);
break;
case SCU:
// switch to Unicode mode and continue parsing
iCur = expandUnicode(in, iCur+1, sb);
// DEBUG Debug.out("Expanded Unicode range until: ", iCur);
break;

case SQU:
// directly extract one Unicode character
iCur += 2;
if( iCur >= in.length)
{
Debug.out("SQU missing argument: ", in, iCur - 2);
break Loop; // buffer length error
}
else
{
char ch = charFromTwoBytes(in[iCur-1], in[iCur]);

Debug.out("Quoted: ", ch);
sb.append((char)ch);
iOut++;
}
break;

case Srs:
throw new IllegalInputException();
// break;
}
}

if( iCur >= in.length)
{
//SUCCESS: all input used up
sb.setLength(iOut);
iIn = iCur;
return sb.toString();
}

Debug.out("Length ==" + in.length+" iCur =", iCur);
//ERROR: premature end of input
throw new EndOfInputException();
}

/** expand a byte array containing compressed Unicode */
public String expand (byte []in)
throws IllegalInputException, EndOfInputException
{
String str = expandSingleByte(in);
Debug.out("expand output: ", str.toCharArray());
return str;
}


/** reset is called to start with new input, w/o creating a new
instance */
public void reset()
{
iOut = 0;
iIn = 0;
super.reset();
}

public int charsWritten()
{
return iOut;
}

public int bytesRead()
{
return iIn;
}
}

+ 45
- 0
src/java/com/healthmarketscience/jackcess/scsu/IllegalInputException.java View File

@@ -0,0 +1,45 @@
package com.healthmarketscience.jackcess.scsu;

/**
* This sample software accompanies Unicode Technical Report #6 and
* distributed as is by Unicode, Inc., subject to the following:
*
* Copyright © 1996-1997 Unicode, Inc.. All Rights Reserved.
*
* Permission to use, copy, modify, and distribute this software
* without fee is hereby granted provided that this copyright notice
* appears in all copies.
*
* UNICODE, INC. MAKES NO REPRESENTATIONS OR WARRANTIES ABOUT THE
* SUITABILITY OF THE SOFTWARE, EITHER EXPRESS OR IMPLIED, INCLUDING
* BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.
* UNICODE, INC., SHALL NOT BE LIABLE FOR ANY ERRORS OR OMISSIONS, AND
* SHALL NOT BE LIABLE FOR ANY DAMAGES, INCLUDING CONSEQUENTIAL AND
* INCIDENTAL DAMAGES, SUFFERED BY YOU AS A RESULT OF USING, MODIFYING
* OR DISTRIBUTING THIS SOFTWARE OR ITS DERIVATIVES.
*
* @author Asmus Freytag
*
* @version 001 Dec 25 1996
* @version 002 Jun 25 1997
* @version 003 Jul 25 1997
* @version 004 Aug 25 1997
*
* Unicode and the Unicode logo are trademarks of Unicode, Inc.,
* and are registered in some jurisdictions.
**/
/**
* The input character array or input byte array contained
* illegal sequences of bytes or characters
*/
public class IllegalInputException extends java.lang.Exception
{
public IllegalInputException(){
super("The input character array or input byte array contained illegal sequences of bytes or characters");
}

public IllegalInputException(String s) {
super(s);
}
}

+ 252
- 0
src/java/com/healthmarketscience/jackcess/scsu/SCSU.java View File

@@ -0,0 +1,252 @@
package com.healthmarketscience.jackcess.scsu;

/*
* This sample software accompanies Unicode Technical Report #6 and
* distributed as is by Unicode, Inc., subject to the following:
*
* Copyright © 1996-1998 Unicode, Inc.. All Rights Reserved.
*
* Permission to use, copy, modify, and distribute this software
* without fee is hereby granted provided that this copyright notice
* appears in all copies.
*
* UNICODE, INC. MAKES NO REPRESENTATIONS OR WARRANTIES ABOUT THE
* SUITABILITY OF THE SOFTWARE, EITHER EXPRESS OR IMPLIED, INCLUDING
* BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.
* UNICODE, INC., SHALL NOT BE LIABLE FOR ANY ERRORS OR OMISSIONS, AND
* SHALL NOT BE LIABLE FOR ANY DAMAGES, INCLUDING CONSEQUENTIAL AND
* INCIDENTAL DAMAGES, SUFFERED BY YOU AS A RESULT OF USING, MODIFYING
* OR DISTRIBUTING THIS SOFTWARE OR ITS DERIVATIVES.
*
* @author Asmus Freytag
*
* @version 001 Dec 25 1996
* @version 002 Jun 25 1997
* @version 003 Jul 25 1997
* @version 004 Aug 25 1997
* @version 005 Sep 30 1998
*
* Unicode and the Unicode logo are trademarks of Unicode, Inc.,
* and are registered in some jurisdictions.
**/

/**
Encoding text data in Unicode often requires more storage than using
an existing 8-bit character set and limited to the subset of characters
actually found in the text. The Unicode Compression Algorithm reduces
the necessary storage while retaining the universality of Unicode.
A full description of the algorithm can be found in document
http://www.unicode.org/unicode/reports/tr6.html

Summary

The goal of the Unicode Compression Algorithm is the abilty to
* Express all code points in Unicode
* Approximate storage size for traditional character sets
* Work well for short strings
* Provide transparency for Latin-1 data
* Support very simple decoders
* Support simple as well as sophisticated encoders

If needed, further compression can be achieved by layering standard
file or disk-block based compression algorithms on top.

<H2>Features</H2>

Languages using small alphabets would contain runs of characters that
are coded close together in Unicode. These runs are interrupted only
by punctuation characters, which are themselves coded in proximity to
each other in Unicode (usually in the ASCII range).

Two basic mechanisms in the compression algorithm account for these two
cases, sliding windows and static windows. A window is an area of 128
consecutive characters in Unicode. In the compressed data stream, each
character from a sliding window would be represented as a byte between
0x80 and 0xFF, while a byte from 0x20 to 0x7F (as well as CR, LF, and
TAB) would always mean an ASCII character (or control).

<H2>Notes on the Java implementation</H2>

A limitation of Java is the exclusive use of a signed byte data type.
The following work arounds are required:

Copying a byte to an integer variable and adding 256 for 'negative'
bytes gives an integer in the range 0-255.

Values of char are between 0x0000 and 0xFFFF in Java. Arithmetic on
char values is unsigned.

Extended characters require an int to store them. The sign is not an
issue because only 1024*1024 + 65536 extended characters exist.

**/
public abstract class SCSU
{
/** Single Byte mode command values */

/** SQ<i>n</i> Quote from Window . <p>
If the following byte is less than 0x80, quote from
static window <i>n</i>, else quote from dynamic window <i>n</i>.
*/

static final byte SQ0 = 0x01; // Quote from window pair 0
static final byte SQ1 = 0x02; // Quote from window pair 1
static final byte SQ2 = 0x03; // Quote from window pair 2
static final byte SQ3 = 0x04; // Quote from window pair 3
static final byte SQ4 = 0x05; // Quote from window pair 4
static final byte SQ5 = 0x06; // Quote from window pair 5
static final byte SQ6 = 0x07; // Quote from window pair 6
static final byte SQ7 = 0x08; // Quote from window pair 7

static final byte SDX = 0x0B; // Define a window as extended
static final byte Srs = 0x0C; // reserved

static final byte SQU = 0x0E; // Quote a single Unicode character
static final byte SCU = 0x0F; // Change to Unicode mode

/** SC<i>n</i> Change to Window <i>n</i>. <p>
If the following bytes are less than 0x80, interpret them
as command bytes or pass them through, else add the offset
for dynamic window <i>n</i>. */
static final byte SC0 = 0x10; // Select window 0
static final byte SC1 = 0x11; // Select window 1
static final byte SC2 = 0x12; // Select window 2
static final byte SC3 = 0x13; // Select window 3
static final byte SC4 = 0x14; // Select window 4
static final byte SC5 = 0x15; // Select window 5
static final byte SC6 = 0x16; // Select window 6
static final byte SC7 = 0x17; // Select window 7
static final byte SD0 = 0x18; // Define and select window 0
static final byte SD1 = 0x19; // Define and select window 1
static final byte SD2 = 0x1A; // Define and select window 2
static final byte SD3 = 0x1B; // Define and select window 3
static final byte SD4 = 0x1C; // Define and select window 4
static final byte SD5 = 0x1D; // Define and select window 5
static final byte SD6 = 0x1E; // Define and select window 6
static final byte SD7 = 0x1F; // Define and select window 7

static final byte UC0 = (byte) 0xE0; // Select window 0
static final byte UC1 = (byte) 0xE1; // Select window 1
static final byte UC2 = (byte) 0xE2; // Select window 2
static final byte UC3 = (byte) 0xE3; // Select window 3
static final byte UC4 = (byte) 0xE4; // Select window 4
static final byte UC5 = (byte) 0xE5; // Select window 5
static final byte UC6 = (byte) 0xE6; // Select window 6
static final byte UC7 = (byte) 0xE7; // Select window 7
static final byte UD0 = (byte) 0xE8; // Define and select window 0
static final byte UD1 = (byte) 0xE9; // Define and select window 1
static final byte UD2 = (byte) 0xEA; // Define and select window 2
static final byte UD3 = (byte) 0xEB; // Define and select window 3
static final byte UD4 = (byte) 0xEC; // Define and select window 4
static final byte UD5 = (byte) 0xED; // Define and select window 5
static final byte UD6 = (byte) 0xEE; // Define and select window 6
static final byte UD7 = (byte) 0xEF; // Define and select window 7

static final byte UQU = (byte) 0xF0; // Quote a single Unicode character
static final byte UDX = (byte) 0xF1; // Define a Window as extended
static final byte Urs = (byte) 0xF2; // reserved

/** constant offsets for the 8 static windows */
static final int staticOffset[] =
{
0x0000, // ASCII for quoted tags
0x0080, // Latin - 1 Supplement (for access to punctuation)
0x0100, // Latin Extended-A
0x0300, // Combining Diacritical Marks
0x2000, // General Punctuation
0x2080, // Currency Symbols
0x2100, // Letterlike Symbols and Number Forms
0x3000 // CJK Symbols and punctuation
};

/** initial offsets for the 8 dynamic (sliding) windows */
static final int initialDynamicOffset[] =
{
0x0080, // Latin-1
0x00C0, // Latin Extended A //@005 fixed from 0x0100
0x0400, // Cyrillic
0x0600, // Arabic
0x0900, // Devanagari
0x3040, // Hiragana
0x30A0, // Katakana
0xFF00 // Fullwidth ASCII
};

/** dynamic window offsets, intitialize to default values. */
int dynamicOffset[] =
{
initialDynamicOffset[0],
initialDynamicOffset[1],
initialDynamicOffset[2],
initialDynamicOffset[3],
initialDynamicOffset[4],
initialDynamicOffset[5],
initialDynamicOffset[6],
initialDynamicOffset[7]
};

// The following method is common to encoder and decoder

private int iWindow = 0; // current active window

/** select the active dynamic window **/
protected void selectWindow(int iWindow)
{
this.iWindow = iWindow;
}

/** select the active dynamic window **/
protected int getCurrentWindow()
{
return this.iWindow;
}

/**
These values are used in defineWindow
**/

/**
* Unicode code points from 3400 to E000 are not adressible by
* dynamic window, since in these areas no short run alphabets are
* found. Therefore add gapOffset to all values from gapThreshold */
static final int gapThreshold = 0x68;
static final int gapOffset = 0xAC00;

/* values between reservedStart and fixedThreshold are reserved */
static final int reservedStart = 0xA8;

/* use table of predefined fixed offsets for values from fixedThreshold */
static final int fixedThreshold = 0xF9;

/** Table of fixed predefined Offsets, and byte values that index into **/
static final int fixedOffset[] =
{
/* 0xF9 */ 0x00C0, // Latin-1 Letters + half of Latin Extended A
/* 0xFA */ 0x0250, // IPA extensions
/* 0xFB */ 0x0370, // Greek
/* 0xFC */ 0x0530, // Armenian
/* 0xFD */ 0x3040, // Hiragana
/* 0xFE */ 0x30A0, // Katakana
/* 0xFF */ 0xFF60 // Halfwidth Katakana
};

/** whether a character is compressible */
public static boolean isCompressible(char ch)
{
return (ch < 0x3400 || ch >= 0xE000);
}

/** reset is only needed to bail out after an exception and
restart with new input */
public void reset()
{

// reset the dynamic windows
for (int i = 0; i < dynamicOffset.length; i++)
{
dynamicOffset[i] = initialDynamicOffset[i];
}
this.iWindow = 0;
}
}

BIN
src/resources/com/healthmarketscience/jackcess/empty.mdb View File


+ 6
- 0
src/resources/com/healthmarketscience/jackcess/log4j.properties View File

@@ -0,0 +1,6 @@
log4j.rootCategory=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=**** %-5p %d{MMM d HH:mm:ss} [%F] - %m%n

log4j.category.com.healthmarketscience.jackcess=INFO

+ 1
- 0
test/data/sample-input-only-headers.tab View File

@@ -0,0 +1 @@
RESULT_PHYS_ID FIRST MIDDLE LAST OUTLIER RANK CLAIM_COUNT PROCEDURE_COUNT WEIGHTED_CLAIM_COUNT WEIGHTED_PROCEDURE_COUNT

+ 3
- 0
test/data/sample-input.tab View File

@@ -0,0 +1,3 @@
Test1 Test2 Test3
Foo Bar Ralph
S Mouse Rocks

BIN
test/data/test.mdb View File


+ 203
- 0
test/src/java/com/healthmarketscience/jackcess/DatabaseTest.java View File

@@ -0,0 +1,203 @@
// Copyright (c) 2004 Health Market Science, Inc.

package com.healthmarketscience.jackcess;

import java.io.File;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.Date;
import java.util.List;
import java.util.Map;

import com.healthmarketscience.jackcess.Column;
import com.healthmarketscience.jackcess.DataTypes;
import com.healthmarketscience.jackcess.Database;
import com.healthmarketscience.jackcess.Table;

import junit.framework.TestCase;

/**
* @author Tim McCune
*/
public class DatabaseTest extends TestCase {
public DatabaseTest(String name) throws Exception {
super(name);
}
private Database open() throws Exception {
return Database.open(new File("test/data/test.mdb"));
}
private Database create() throws Exception {
File tmp = File.createTempFile("databaseTest", ".mdb");
tmp.deleteOnExit();
return Database.create(tmp);
}
public void testGetColumns() throws Exception {
List columns = open().getTable("Table1").getColumns();
assertEquals(9, columns.size());
checkColumn(columns, 0, "A", DataTypes.TEXT);
checkColumn(columns, 1, "B", DataTypes.TEXT);
checkColumn(columns, 2, "C", DataTypes.BYTE);
checkColumn(columns, 3, "D", DataTypes.INT);
checkColumn(columns, 4, "E", DataTypes.LONG);
checkColumn(columns, 5, "F", DataTypes.DOUBLE);
checkColumn(columns, 6, "G", DataTypes.SHORT_DATE_TIME);
checkColumn(columns, 7, "H", DataTypes.MONEY);
checkColumn(columns, 8, "I", DataTypes.BOOLEAN);
}
private void checkColumn(List columns, int columnNumber, String name, byte dataType)
throws Exception {
Column column = (Column) columns.get(columnNumber);
assertEquals(name, column.getName());
assertEquals(dataType, column.getType());
}
public void testGetNextRow() throws Exception {
Database db = open();
assertEquals(1, db.getTableNames().size());
Table table = db.getTable("Table1");
Map row = table.getNextRow();
assertEquals("abcdefg", row.get("A"));
assertEquals("hijklmnop", row.get("B"));
assertEquals(new Byte((byte) 2), row.get("C"));
assertEquals(new Short((short) 222), row.get("D"));
assertEquals(new Integer(333333333), row.get("E"));
assertEquals(new Double(444.555d), row.get("F"));
Calendar cal = Calendar.getInstance();
cal.setTime((Date) row.get("G"));
assertEquals(Calendar.SEPTEMBER, cal.get(Calendar.MONTH));
assertEquals(21, cal.get(Calendar.DAY_OF_MONTH));
assertEquals(1974, cal.get(Calendar.YEAR));
assertEquals(Boolean.TRUE, row.get("I"));
row = table.getNextRow();
assertEquals("a", row.get("A"));
assertEquals("b", row.get("B"));
assertEquals(new Byte((byte) 0), row.get("C"));
assertEquals(new Short((short) 0), row.get("D"));
assertEquals(new Integer(0), row.get("E"));
assertEquals(new Double(0d), row.get("F"));
cal = Calendar.getInstance();
cal.setTime((Date) row.get("G"));
assertEquals(Calendar.DECEMBER, cal.get(Calendar.MONTH));
assertEquals(12, cal.get(Calendar.DAY_OF_MONTH));
assertEquals(1981, cal.get(Calendar.YEAR));
assertEquals(Boolean.FALSE, row.get("I"));
}
public void testCreate() throws Exception {
Database db = create();
assertEquals(0, db.getTableNames().size());
}
public void testWriteAndRead() throws Exception {
Database db = create();
createTestTable(db);
Object[] row = new Object[9];
row[0] = "Tim";
row[1] = "R";
row[2] = "McCune";
row[3] = new Integer(1234);
row[4] = new Byte((byte) 0xad);
row[5] = new Double(555.66d);
row[6] = new Float(777.88d);
row[7] = new Short((short) 999);
row[8] = new Date();
Table table = db.getTable("Test");
int count = 1000;
for (int i = 0; i < count; i++) {
table.addRow(row);
}
for (int i = 0; i < count; i++) {
Map readRow = table.getNextRow();
assertEquals(row[0], readRow.get("A"));
assertEquals(row[1], readRow.get("B"));
assertEquals(row[2], readRow.get("C"));
assertEquals(row[3], readRow.get("D"));
assertEquals(row[4], readRow.get("E"));
assertEquals(row[5], readRow.get("F"));
assertEquals(row[6], readRow.get("G"));
assertEquals(row[7], readRow.get("H"));
}
}
public void testWriteAndReadInBatch() throws Exception {
Database db = create();
createTestTable(db);
int count = 1000;
List rows = new ArrayList(count);
Object[] row = new Object[9];
row[0] = "Tim";
row[1] = "R";
row[2] = "McCune";
row[3] = new Integer(1234);
row[4] = new Byte((byte) 0xad);
row[5] = new Double(555.66d);
row[6] = new Float(777.88d);
row[7] = new Short((short) 999);
row[8] = new Date();
for (int i = 0; i < count; i++) {
rows.add(row);
}
Table table = db.getTable("Test");
table.addRows(rows);
for (int i = 0; i < count; i++) {
Map readRow = table.getNextRow();
assertEquals(row[0], readRow.get("A"));
assertEquals(row[1], readRow.get("B"));
assertEquals(row[2], readRow.get("C"));
assertEquals(row[3], readRow.get("D"));
assertEquals(row[4], readRow.get("E"));
assertEquals(row[5], readRow.get("F"));
assertEquals(row[6], readRow.get("G"));
assertEquals(row[7], readRow.get("H"));
}
}
private void createTestTable(Database db) throws Exception {
List columns = new ArrayList();
Column col = new Column();
col.setName("A");
col.setType(DataTypes.TEXT);
columns.add(col);
col = new Column();
col.setName("B");
col.setType(DataTypes.TEXT);
columns.add(col);
col = new Column();
col.setName("C");
col.setType(DataTypes.TEXT);
columns.add(col);
col = new Column();
col.setName("D");
col.setType(DataTypes.LONG);
columns.add(col);
col = new Column();
col.setName("E");
col.setType(DataTypes.BYTE);
columns.add(col);
col = new Column();
col.setName("F");
col.setType(DataTypes.DOUBLE);
columns.add(col);
col = new Column();
col.setName("G");
col.setType(DataTypes.FLOAT);
columns.add(col);
col = new Column();
col.setName("H");
col.setType(DataTypes.INT);
columns.add(col);
col = new Column();
col.setName("I");
col.setType(DataTypes.SHORT_DATE_TIME);
columns.add(col);
db.createTable("test", columns);
}
}

+ 45
- 0
test/src/java/com/healthmarketscience/jackcess/ImportTest.java View File

@@ -0,0 +1,45 @@
// Copyright (c) 2004 Health Market Science, Inc.

package com.healthmarketscience.jackcess;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

import com.healthmarketscience.jackcess.Database;

import java.io.File;
import junit.framework.TestCase;

/**
* @author Rob Di Marco
*/
public class ImportTest extends TestCase
{

/** The logger to use. */
private static final Log LOG = LogFactory.getLog(ImportTest.class);
public ImportTest(String name)
{
super(name);
}

private Database create() throws Exception {
File tmp = File.createTempFile("databaseTest", ".mdb");
tmp.deleteOnExit();
return Database.create(tmp);
}

public void testImportFromFile() throws Exception
{
Database db = create();
db.importFile("test", new File("test/data/sample-input.tab"), "\\t");
}

public void testImportFromFileWithOnlyHeaders() throws Exception
{
Database db = create();
db.importFile("test", new File("test/data/sample-input-only-headers.tab"),
"\\t");
}

}

+ 51
- 0
test/src/java/com/healthmarketscience/jackcess/TableTest.java View File

@@ -0,0 +1,51 @@
// Copyright (c) 2004 Health Market Science, Inc.

package com.healthmarketscience.jackcess;

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

import com.healthmarketscience.jackcess.Column;
import com.healthmarketscience.jackcess.DataTypes;
import com.healthmarketscience.jackcess.Table;

import junit.framework.TestCase;

/**
* @author Tim McCune
*/
public class TableTest extends TestCase {
public TableTest(String name) {
super(name);
}
public void testCreateRow() throws Exception {
Table table = new Table();
List columns = new ArrayList();
Column col = new Column();
col.setType(DataTypes.INT);
columns.add(col);
col = new Column();
col.setType(DataTypes.TEXT);
columns.add(col);
columns.add(col);
table.setColumns(columns);
int colCount = 3;
Object[] row = new Object[colCount];
row[0] = new Short((short) 9);
row[1] = "Tim";
row[2] = "McCune";
ByteBuffer buffer = table.createRow(row);
assertEquals((short) colCount, buffer.getShort());
assertEquals((short) 9, buffer.getShort());
assertEquals((byte) 'T', buffer.get());
assertEquals((short) 22, buffer.getShort(22));
assertEquals((short) 10, buffer.getShort(24));
assertEquals((short) 4, buffer.getShort(26));
assertEquals((short) 2, buffer.getShort(28));
assertEquals((byte) 7, buffer.get(30));
}
}

+ 110
- 0
xdocs/faq.fml View File

@@ -0,0 +1,110 @@
<?xml version="1.0"?>

<faqs title="Frequently Asked Questions">

<part id="general">
<title>General</title>

<faq id="linux">
<question>Does this work on Linux/Unix?</question>
<answer>
<p>Yep, Jackcess is pure Java. It will work on any
Java Virtual Machine (1.4+).</p>
</answer>
</faq>

<faq id="formats">
<question>What Access formats does it support?</question>
<answer>
<p>Jackcess currently supports <i>only</i> Access 2000
databases. Access 2003 is not supported.</p>
</answer>
</faq>

<faq id="mdbtools">
<question>
How is this different from
<a href="http://mdbtools.sf.net">mdbtools</a>?
</question>
<answer>
<p>
We want to give a lot of credit to mdbtools. They have
been around much longer than Jackcess, and, along with
<a href="http://jakarta.apache.org/poi">POI</a>,
inspired us that a project like this could be done.
mdbtools is written in C. There is a Java port of it,
but if you've ever read or used a Java port of a C
library, you can appreciate the difference between such
a library and one written from scratch in Java.
</p>
<p>
At the time of this writing, mdbtools could only read
Access databases. Jackcess can also write to them.
According to their web site, "Write support is currently being
worked on and the first cut is expected to be included in the
0.6 release." This status hasn't changed since we first
started work on Jackcess.
</p>
<p>
mdbtools supports Access 97 databases, which Jackcess does not.
The Java port of mdbtools also includes an implementation of
a small subset of the JDBC APIs. Jackcess does not currently,
but a pure Java JDBC driver for Access could certainly be written
on top of Jackcess.
</p>
</answer>
</faq>

<faq id="poi">
<question>
This looks like a logical addition to
<a href="http://jakarta.apache.org/poi">POI</a>. Why not integrate
with that project?
</question>
<answer>
<p>
POI is released under
<a href="http://www.apache.org/foundation/licence-FAQ.html">The Apache License</a>.
Jackcess is released under
<a href="http://www.gnu.org/copyleft/lesser.html">The GNU Lesser General Public License</a>.
The Apache license allows closed-source and/or commercial forks.
The LGPL does not. If you change or enhance Jackcess, you must contribute
your changes back to the project.
</p>
</answer>
</faq>

<faq id="hms">
<question>Who is Health Market Science?</question>
<answer>
<p>
HMS is a small company located in suburban Philadelphia.
Using proprietary matching and consolidation software,
HMS scientifically manufactures the most comprehensive
and accurate healthcare data sets in the market today.
<a href="http://www.healthmarketscience.com/careers.htm">We're hiring!</a>
HMS is always looking for talented individuals, especially
<a href="http://www.healthmarketscience.com/hr_web/active/hms_software_developer.htm">Java developers</a>.
</p>
</answer>
</faq>

<faq id="bugs">
<question>It doesn't work!</question>
<answer>
<p>
Ok, that wasn't a question, but we'll try to respond anyway. :)
Jackcess is young, and not that robust yet. As you might imagine,
it's kind of hard to test, simply by its nature. There are
bugs that we are aware of, and certainly many more that we are not.
If you find what looks like a bug, please
<a href="http://sf.net/tracker/?group_id=134943&amp;atid=731445">report it.</a>
Even better, fix it, and
<a href="http://sf.net/tracker/?group_id=134943&amp;atid=731447">submit a patch.</a>
</p>
</answer>
</faq>

</part>

</faqs>

+ 47
- 0
xdocs/index.xml View File

@@ -0,0 +1,47 @@
<?xml version="1.0"?>

<document>
<properties>
<author email="javajedi@users.sf.net">Tim McCune</author>
</properties>
<body>
<section name="Jackcess">
<p>
Jackcess ia a pure Java library for reading from and
writing to MS Access databases. It is not an application.
There is no GUI. It's a library, intended for other
developers to use to build Java applications. Take a look
at our <a href="faq.html">Frequently Asked Questions</a>
for more info.
</p>
</section>
<section name="Sample code">
<p>
<ul>
<li>Displaying the contents of a table:
<pre>Database.open(new File("my.mdb")).getTable("MyTable").display();</pre>
</li>
<li>Creating a new table and writing data into it:
<pre>Database db = Database.create(new File("new.mdb"));
Column a = new Column();
a.setName("a");
a.setSQLType(Types.INTEGER);
Column b = new Column();
b.setName("b");
b.setSQLType(Types.VARCHAR);
db.createTable("NewTable", Arrays.asList(a, b));
Table newTable = db.getTable("NewTable");
newTable.addRow(new Object[] {1, "foo"});</pre>
</li>
<li>Copying the contents of a JDBC ResultSet (e.g. from an
external database) into a new table:
<pre>Database.open(new File("my.mdb")).copyTable("Imported", resultSet);</pre>
</li>
<li>Copying the contents of a CSV file into a new table:
<pre>Database.open(new File("my.mdb")).importFile("Imported2", new File("my.csv"), ",");</pre>
</li>
</ul>
</p>
</section>
</body>
</document>

Loading…
Cancel
Save