This article helps Java developers build desktop and server-side applications that detect machine-readable zones (MRZ) in passports, travel documents, and ID cards. You will see how to wrap the Dynamsoft C++ OCR SDK in a Java Jar package and how to create a command-line MRZ detector with a few lines of Java code.
Java Classes and Methods for MRZ Detection
Create NativeLabelRecognizer.java, NativeLoader.java, MrzResult.java, and MrzParser.java.
- NativeLabelRecognizer.java is a wrapper class for the native library. It loads the native library and calls the native methods. The primary native methods defined in NativeLabelRecognizer are as follows:
public NativeLabelRecognizer() {
    nativePtr = nativeCreateInstance();
}

public void destroyInstance() {
    if (nativePtr != 0)
        nativeDestroyInstance(nativePtr);
}

public static int setLicense(String license) {
    return nativeInitLicense(license);
}

public ArrayList<MrzResult> detectFile(String fileName) {
    return nativeDetectFile(nativePtr, fileName);
}

public String getVersion() {
    return nativeGetVersion();
}

public int loadModel() throws IOException {
    ...
    return nativeLoadModel(nativePtr, targetPath);
}

private native static int nativeInitLicense(String license);

private native long nativeCreateInstance();

private native void nativeDestroyInstance(long nativePtr);

private native ArrayList<MrzResult> nativeDetectFile(long nativePtr, String fileName);

private native int nativeLoadModel(long nativePtr, String modelPath);
The loadModel() method is special. It needs to dynamically update the model path specified in the JSON-formatted template file according to the extraction path of the Jar package. Gson can be used to load and update the JSON object.
public int loadModel() throws IOException {
    String modeFile = "MRZ.json";
    String tempFolder = new File(System.getProperty("java.io.tmpdir")).getAbsolutePath();
    String targetPath = new File(tempFolder, modeFile).getAbsolutePath();

    // Read the extracted template. Modify the model path based on your own environment.
    StringBuilder sb = new StringBuilder();
    try (FileReader reader = new FileReader(targetPath)) {
        char[] chars = new char[1024];
        int len;
        while ((len = reader.read(chars)) != -1) {
            sb.append(chars, 0, len);
        }
    }
    String template = sb.toString();

    // Point DirectoryPath at the folder the model files were extracted to.
    Gson gson = new Gson();
    JsonObject jsonObject = gson.fromJson(template, JsonObject.class);
    JsonArray array = jsonObject.get("CharacterModelArray").getAsJsonArray();
    JsonObject object = array.get(0).getAsJsonObject();
    String modelPath = object.get("DirectoryPath").getAsString();
    if (modelPath != null && modelPath.contains("model")) {
        object.addProperty("DirectoryPath", tempFolder);
    }

    // Write the updated template back before handing it to the native loader.
    try (FileWriter writer = new FileWriter(targetPath)) {
        writer.write(jsonObject.toString());
    }

    return nativeLoadModel(nativePtr, targetPath);
}
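Because the wrapper owns a raw native pointer, calling destroyInstance() twice would hand an already released handle to the native layer. One possible guard, sketched here under assumptions, is to zero the pointer after release and implement AutoCloseable. FakeNative and SafeRecognizer are hypothetical names; FakeNative stands in for the JNI calls so the example runs without the SDK.

```java
import java.util.concurrent.atomic.AtomicInteger;

class NativeHandleDemo {
    // Hypothetical stand-in for the JNI layer so the sketch runs without the SDK.
    static final class FakeNative {
        static final AtomicInteger destroyCalls = new AtomicInteger();

        static long createInstance() {
            return 42L; // any non-zero value plays the role of a native pointer
        }

        static void destroyInstance(long ptr) {
            destroyCalls.incrementAndGet();
        }
    }

    // Illustrative wrapper (not the SDK's class) that owns one native handle.
    static final class SafeRecognizer implements AutoCloseable {
        private long nativePtr = FakeNative.createInstance();

        boolean isOpen() {
            return nativePtr != 0;
        }

        @Override
        public synchronized void close() {
            if (nativePtr != 0) {
                FakeNative.destroyInstance(nativePtr);
                nativePtr = 0; // makes a second close() a no-op
            }
        }
    }

    public static void main(String[] args) {
        try (SafeRecognizer recognizer = new SafeRecognizer()) {
            System.out.println("open: " + recognizer.isOpen());
        } // close() runs here, even if the body throws
        System.out.println("destroy calls: " + FakeNative.destroyCalls.get());
    }
}
```

With this shape, a try-with-resources block releases the handle exactly once, and later calls to close() do nothing.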
- NativeLoader.java is a utility class that extracts the MRZ OCR model files and C++ shared library files from the Jar package and loads the native libraries. All assets are extracted to the operating system's temporary directory. An MD5 checksum is used to detect whether a file has changed.
private static boolean extractResourceFiles(String dlrNativeLibraryPath, String dlrNativeLibraryName,
        String tempFolder) throws IOException {
    String[] filenames = null;
    if (Utils.isWindows()) {
        filenames = new String[] {"DynamsoftLicenseClientx64.dll", "vcomp140.dll", "DynamicPdfx64.dll", "DynamsoftLabelRecognizerx64.dll", "dlr.dll"};
    } else if (Utils.isLinux()) {
        filenames = new String[] {"libDynamicPdf.so", "libDynamsoftLicenseClient.so", "libDynamsoftLabelRecognizer.so", "libdlr.so"};
    }

    boolean ret = true;
    for (String file : filenames) {
        ret &= extractAndLoadLibraryFile(dlrNativeLibraryPath, file, tempFolder);
    }

    // Extract model files
    String modelPath = "/model";
    filenames = new String[] {"MRZ.json", "MRZ.caffemodel", "MRZ.txt", "MRZ.prototxt"};
    for (String file : filenames) {
        ret &= extractAndLoadLibraryFile(modelPath, file, tempFolder);
    }
    return ret;
}

static String md5sum(InputStream input) throws IOException {
    BufferedInputStream in = new BufferedInputStream(input);
    try {
        MessageDigest digest = java.security.MessageDigest.getInstance("MD5");
        DigestInputStream digestInputStream = new DigestInputStream(in, digest);
        for (; digestInputStream.read() >= 0;) {
        }
        ByteArrayOutputStream md5out = new ByteArrayOutputStream();
        md5out.write(digest.digest());
        return md5out.toString();
    } catch (NoSuchAlgorithmException e) {
        throw new IllegalStateException("MD5 algorithm is not available: " + e);
    } finally {
        in.close();
    }
}
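The checksum comparison can be illustrated in isolation. The following is a minimal sketch, not the loader's actual code: it assumes the resource has already been read into a byte array, hex-encodes the digest for readability, and uses the hypothetical names ExtractIfChanged, md5Hex, and extractIfChanged. For large files such as the model, streaming through DigestInputStream as md5sum() does would use less memory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ExtractIfChanged {
    // Hex-encoded MD5 of a byte array.
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 algorithm is not available", e);
        }
    }

    // Writes content to target unless a file with the same checksum is already there.
    // Returns true when the file was written or rewritten.
    static boolean extractIfChanged(byte[] content, Path target) {
        try {
            if (Files.exists(target)
                    && md5Hex(Files.readAllBytes(target)).equals(md5Hex(content))) {
                return false;
            }
            Files.write(target, content);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("dlr-demo");
        Path target = dir.resolve("MRZ.txt");
        byte[] v1 = "model v1".getBytes(StandardCharsets.UTF_8);
        System.out.println(extractIfChanged(v1, target)); // true: first extraction writes the file
        System.out.println(extractIfChanged(v1, target)); // false: same checksum, nothing to do
    }
}
```

Skipping the write when the checksum matches avoids rewriting files on every start and still picks up a new library or model after the Jar is upgraded.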
- MrzResult.java is a Java class to store the MRZ detection results, including detection confidence, text, and coordinates.
public class MrzResult {
    public int confidence;
    public String text;
    public int x1, y1, x2, y2, x3, y3, x4, y4;

    public MrzResult(int confidence, String text, int x1, int y1, int x2, int y2, int x3, int y3, int x4, int y4) {
        this.confidence = confidence;
        this.text = text;
        this.x1 = x1;
        this.y1 = y1;
        this.x2 = x2;
        this.y2 = y2;
        this.x3 = x3;
        this.y3 = y3;
        this.x4 = x4;
        this.y4 = y4;
    }
}
- MrzParser.java is a Java class to parse the MRZ detection results and decode the MRZ information. The MRZ information includes the document type, issuing country, document number, date of birth, and expiration date, which are stored in a com.google.gson.JsonObject object.
JsonObject mrzInfo = new JsonObject();
...
// Get issuing State information
String nation = line1.substring(2, 7);
pattern = Pattern.compile("[0-9]");
matcher = pattern.matcher(nation);
if (matcher.matches())
    return null;
if (nation.charAt(nation.length() - 1) == '<') {
    nation = nation.substring(0, 2);
}
mrzInfo.addProperty("nationality", nation);

// Get surname information
line1 = line1.substring(5);
int pos = line1.indexOf("<<");
String surName = line1.substring(0, pos);
pattern = Pattern.compile("[0-9]");
matcher = pattern.matcher(surName);
if (matcher.matches())
    return null;
surName = surName.replace("<", " ");
mrzInfo.addProperty("surname", surName);

// Get givenname information
String givenName = line1.substring(surName.length() + 2);
pattern = Pattern.compile("[0-9]");
matcher = pattern.matcher(givenName);
if (matcher.matches())
    return null;
givenName = givenName.replace("<", " ");
givenName = givenName.trim();
mrzInfo.addProperty("givenname", givenName);

// Get passport number information
String passportNumber = "";
passportNumber = line2.substring(0, 9);
passportNumber = passportNumber.replace("<", " ");
mrzInfo.addProperty("passportnumber", passportNumber);
...
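To make the field layout concrete, here is a small standalone sketch of parsing the first line of a two-line passport MRZ. It is not the MrzParser implementation. It assumes the ICAO 9303 TD3 layout (document code in positions 1 to 2, issuing state in positions 3 to 5, names from position 6), stores fields in a plain Map instead of a Gson JsonObject, and uses Matcher.find() for the digit check, since matches() only succeeds when the whole field is a single digit. The names MrzLine1Sketch, parseLine1, and pad are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

class MrzLine1Sketch {
    private static final Pattern DIGIT = Pattern.compile("[0-9]");

    // Pads a fragment with the MRZ filler character '<' up to the given width.
    static String pad(String s, int width) {
        StringBuilder sb = new StringBuilder(s);
        while (sb.length() < width) {
            sb.append('<');
        }
        return sb.toString();
    }

    // Parses line 1 of a two-line passport MRZ (assumed to be 44 characters).
    // Returns null when the line is too short, has no "<<" name separator,
    // or a state or name field contains a digit.
    static Map<String, String> parseLine1(String line1) {
        if (line1 == null || line1.length() < 44) {
            return null;
        }
        String state = line1.substring(2, 5); // three-character issuing state
        String names = line1.substring(5);    // SURNAME<<GIVEN<NAMES<<<...
        int sep = names.indexOf("<<");
        if (sep < 0) {
            return null;
        }
        String surname = names.substring(0, sep);
        String given = names.substring(sep + 2);

        // find() reports a digit anywhere in the field.
        if (DIGIT.matcher(state).find() || DIGIT.matcher(surname).find()
                || DIGIT.matcher(given).find()) {
            return null;
        }

        Map<String, String> info = new LinkedHashMap<>();
        info.put("type", line1.substring(0, 1));
        info.put("issuingState", state.replace("<", ""));
        info.put("surname", surname.replace('<', ' ').trim());
        info.put("givenname", given.replace('<', ' ').trim());
        return info;
    }

    public static void main(String[] args) {
        // Made-up specimen data, padded to the assumed 44-character width.
        String line1 = pad("P<UTOERIKSSON<<ANNA<MARIA", 44);
        System.out.println(parseLine1(line1));
    }
}
```

Returning null on a failed check mirrors the parser's behavior of rejecting lines that the OCR step has likely misread.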
When the Java classes are finished, we can generate the JNI header file automatically with javah (JDK 10 and later removed javah; javac -h is its replacement there):
cd src/main/java
javah -o ../../../jni/NativeLabelRecognizer.h com.dynamsoft.dlr.NativeLabelRecognizer
Write JNI Wrapper for Dynamsoft C++ OCR SDK
We create a CMake project to build a JNI wrapper with Dynamsoft Label Recognizer SDK.
Here is the CMakeLists.txt file:
cmake_minimum_required (VERSION 2.6)
project (dlr)
MESSAGE( STATUS "PROJECT_NAME: " ${PROJECT_NAME} )
find_package(JNI REQUIRED)
include_directories(${JNI_INCLUDE_DIRS})
MESSAGE( STATUS "JNI_INCLUDE_DIRS: " ${JNI_INCLUDE_DIRS})
# Check lib
if (CMAKE_HOST_WIN32)
set(WINDOWS 1)
elseif(CMAKE_HOST_APPLE)
set(MACOS 1)
elseif(CMAKE_HOST_UNIX)
set(LINUX 1)
endif()
# Set RPATH
if(CMAKE_HOST_UNIX)
SET(CMAKE_CXX_FLAGS "-std=c++11 -O3 -Wl,-rpath=$ORIGIN")
SET(CMAKE_INSTALL_RPATH "$ORIGIN")
SET(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE)
endif()
# Add search path for include and lib files
if(WINDOWS)
link_directories("${PROJECT_SOURCE_DIR}/lib/win/" ${JNI_LIBRARIES})
elseif(LINUX)
link_directories("${PROJECT_SOURCE_DIR}/lib/linux/" ${JNI_LIBRARIES})
endif()
include_directories("${PROJECT_BINARY_DIR}" "${PROJECT_SOURCE_DIR}/include/")
# Add the library
add_library(dlr SHARED NativeLabelRecognizer.cxx)
if(WINDOWS)
target_link_libraries (${PROJECT_NAME} "DynamsoftLabelRecognizerx64")
else()
target_link_libraries (${PROJECT_NAME} "DynamsoftLabelRecognizer" pthread)
endif()
# Set installation directory
set(CMAKE_INSTALL_PREFIX "${PROJECT_SOURCE_DIR}/../src/main/")
set(LIBRARY_PATH "java/com/dynamsoft/dlr/native")
if(WINDOWS)
install (DIRECTORY "${PROJECT_SOURCE_DIR}/lib/win/" DESTINATION "${CMAKE_INSTALL_PREFIX}${LIBRARY_PATH}/win")
install (TARGETS dlr DESTINATION "${CMAKE_INSTALL_PREFIX}${LIBRARY_PATH}/win")
elseif(LINUX)
install (DIRECTORY "${PROJECT_SOURCE_DIR}/lib/linux/" DESTINATION "${CMAKE_INSTALL_PREFIX}${LIBRARY_PATH}/linux")
install (TARGETS dlr DESTINATION "${CMAKE_INSTALL_PREFIX}${LIBRARY_PATH}/linux")
endif()
This is a shared library project. The dlr library is built from the NativeLabelRecognizer.cxx file. All shared libraries will be installed to the src/main/java/com/dynamsoft/dlr/native directory after building:
mkdir build
cd build
cmake ..
cmake --build . --config Release --target install
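The install step leaves the libraries in per-platform folders (native/win and native/linux), so the Java side has to pick one at runtime. A minimal sketch of that selection follows. It assumes the folders end up on the classpath under /com/dynamsoft/dlr/native/, and the names NativeFolderSketch, nativeFolder, and resourceDir are hypothetical, not the article's Utils class.

```java
import java.util.Locale;

class NativeFolderSketch {
    // Maps an os.name value to the sub-folder used by the CMake install step.
    // Returns null for platforms the build does not cover, such as macOS.
    static String nativeFolder(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.startsWith("windows")) {
            return "win";
        }
        if (os.contains("linux")) {
            return "linux";
        }
        return null;
    }

    // Classpath directory that holds the shared libraries for the platform.
    static String resourceDir(String osName) {
        String folder = nativeFolder(osName);
        return folder == null ? null : "/com/dynamsoft/dlr/native/" + folder;
    }

    public static void main(String[] args) {
        System.out.println(resourceDir(System.getProperty("os.name")));
    }
}
```

Returning null for unsupported platforms lets the caller fail with a clear message instead of a missing-resource error later.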
The JNI methods are implemented in the NativeLabelRecognizer.cxx file:
- Initialize the license:

JNIEXPORT jint JNICALL Java_com_dynamsoft_dlr_NativeLabelRecognizer_nativeInitLicense(JNIEnv *env, jclass, jstring license)
{
    const char *pszLicense = env->GetStringUTFChars(license, NULL);
    char errorMsgBuffer[512];
    // Click https://www.dynamsoft.com/customer/license/trialLicense/?product=dlr to get a trial license.
    int ret = DLR_InitLicense(pszLicense, errorMsgBuffer, 512);
    printf("DLR_InitLicense: %s\n", errorMsgBuffer);
    env->ReleaseStringUTFChars(license, pszLicense);
    return ret;
}
- Create the instance of Dynamsoft Label Recognizer:

JNIEXPORT jlong JNICALL Java_com_dynamsoft_dlr_NativeLabelRecognizer_nativeCreateInstance(JNIEnv *, jobject)
{
    return (jlong)DLR_CreateInstance();
}
- Destroy the instance of Dynamsoft Label Recognizer:

JNIEXPORT void JNICALL Java_com_dynamsoft_dlr_NativeLabelRecognizer_nativeDestroyInstance(JNIEnv *, jobject, jlong handler)
{
    if (handler)
    {
        DLR_DestroyInstance((void *)handler);
    }
}
- Load the model file:

JNIEXPORT jint JNICALL Java_com_dynamsoft_dlr_NativeLabelRecognizer_nativeLoadModel(JNIEnv *env, jobject, jlong handler, jstring filename)
{
    const char *pFileName = env->GetStringUTFChars(filename, NULL);
    char errorMsgBuffer[512];
    int ret = DLR_AppendSettingsFromFile((void*)handler, pFileName, errorMsgBuffer, 512);
    printf("Load MRZ model: %s\n", errorMsgBuffer);
    env->ReleaseStringUTFChars(filename, pFileName);
    return ret;
}
- Detect MRZ from an image file and return a list of MRZ results:

JNIEXPORT jobject JNICALL Java_com_dynamsoft_dlr_NativeLabelRecognizer_nativeDetectFile(JNIEnv *env, jobject, jlong handler, jstring filename)
{
    jobject arrayList = NULL;

    // Look up the Java classes and methods needed to build the result list.
    // A failed lookup leaves a pending Java exception, so stop right away.
    jclass mrzResultClass = env->FindClass("com/dynamsoft/dlr/MrzResult");
    if (NULL == mrzResultClass) { printf("FindClass failed\n"); return NULL; }

    jmethodID mrzResultConstructor = env->GetMethodID(mrzResultClass, "<init>", "(ILjava/lang/String;IIIIIIII)V");
    if (NULL == mrzResultConstructor) { printf("GetMethodID failed\n"); return NULL; }

    jclass arrayListClass = env->FindClass("java/util/ArrayList");
    if (NULL == arrayListClass) { printf("FindClass failed\n"); return NULL; }

    jmethodID arrayListConstructor = env->GetMethodID(arrayListClass, "<init>", "()V");
    if (NULL == arrayListConstructor) { printf("GetMethodID failed\n"); return NULL; }

    jmethodID arrayListAdd = env->GetMethodID(arrayListClass, "add", "(Ljava/lang/Object;)Z");
    if (NULL == arrayListAdd) { printf("GetMethodID failed\n"); return NULL; }

    const char *pFileName = env->GetStringUTFChars(filename, NULL);
    int ret = DLR_RecognizeByFile((void *)handler, pFileName, "locr");
    // The file name is no longer needed; release it before any early return.
    env->ReleaseStringUTFChars(filename, pFileName);
    if (ret)
    {
        printf("Detection error: %s\n", DLR_GetErrorString(ret));
    }

    DLR_ResultArray *pResults = NULL;
    DLR_GetAllResults((void *)handler, &pResults);
    if (!pResults)
    {
        return NULL;
    }

    int count = pResults->resultsCount;
    arrayList = env->NewObject(arrayListClass, arrayListConstructor);

    for (int i = 0; i < count; i++)
    {
        DLR_Result *mrzResult = pResults->results[i];
        int lCount = mrzResult->lineResultsCount;
        for (int j = 0; j < lCount; j++)
        {
            // Each recognized line carries its text, confidence, and four corner points.
            DM_Point *points = mrzResult->lineResults[j]->location.points;
            int x1 = points[0].x;
            int y1 = points[0].y;
            int x2 = points[1].x;
            int y2 = points[1].y;
            int x3 = points[2].x;
            int y3 = points[2].y;
            int x4 = points[3].x;
            int y4 = points[3].y;

            jobject object = env->NewObject(mrzResultClass, mrzResultConstructor, mrzResult->lineResults[j]->confidence, env->NewStringUTF(mrzResult->lineResults[j]->text), x1, y1, x2, y2, x3, y3, x4, y4);
            env->CallBooleanMethod(arrayList, arrayListAdd, object);
        }
    }

    // Release memory
    DLR_FreeResults(&pResults);
    return arrayList;
}
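The descriptor strings passed to GetMethodID, such as "(ILjava/lang/String;IIIIIIII)V", are easy to mistype. One way to double-check them from the Java side is to let java.lang.invoke.MethodType print the descriptor for a given signature. The sketch below uses a local stand-in with the same constructor shape as MrzResult; the class and helper names are hypothetical.

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Constructor;

class JniDescriptorSketch {
    // Local stand-in with the same constructor shape as MrzResult, for illustration only.
    static class MrzResult {
        MrzResult(int confidence, String text, int x1, int y1, int x2, int y2,
                  int x3, int y3, int x4, int y4) {
        }
    }

    // Descriptor of a constructor: its parameter types followed by V (void).
    static String constructorDescriptor(Constructor<?> ctor) {
        return MethodType.methodType(void.class, ctor.getParameterTypes())
                .toMethodDescriptorString();
    }

    // Descriptor of a method with the given return and parameter types.
    static String methodDescriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes)
                .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        Constructor<?> ctor = MrzResult.class.getDeclaredConstructors()[0];
        System.out.println(constructorDescriptor(ctor));                   // (ILjava/lang/String;IIIIIIII)V
        System.out.println(methodDescriptor(boolean.class, Object.class)); // (Ljava/lang/Object;)Z
    }
}
```

The JDK's javap -s option prints the same descriptors for a compiled class, which is a quick alternative when the class file is at hand.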
Build Java Jar Package with Resources and Dependencies
The target package should include Java classes, C++ library files, model files, and dependencies. By default, Maven packages only the compiled Java classes. To include the C++ library files, model files, and dependencies, add the following configuration to the pom.xml file:
<build>
    <resources>
        <resource>
            <directory>src/main/java</directory>
            <excludes>
                <exclude>**/*.md</exclude>
                <exclude>**/*.h</exclude>
                <exclude>**/*.lib</exclude>
                <exclude>**/*.java</exclude>
            </excludes>
        </resource>
        <resource>
            <directory>res</directory>
        </resource>
    </resources>
    <plugins>
        <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <configuration>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
        </plugin>
    </plugins>
</build>
- src/main/java is the directory that contains the native library files, which are installed after building the JNI wrapper.
- res is the directory that contains the model files. Its structure is as follows:

res
└───model
    ├───MRZ.caffemodel
    ├───MRZ.json
    ├───MRZ.prototxt
    └───MRZ.txt

- maven-assembly-plugin is used to build dependencies into the target package for easy deployment.
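Finally, run the mvn install assembly:assembly command to generate a dlr-1.0.0-jar-with-dependencies.jar file. With maven-assembly-plugin 3.x, which no longer provides the assembly:assembly goal, mvn install assembly:single is the equivalent.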
Steps to Build an MRZ Detector in Java
Now, let's create a Java MRZ detector with a few lines of code.
- Get a 30-day free trial license of Dynamsoft Label Recognizer and activate it in the Java code:
NativeLabelRecognizer.setLicense("DLS2eyJoYW5kc2hha2VDb2RlIjoiMjAwMDAxLTE2NDk4Mjk3OTI2MzUiLCJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSIsInNlc3Npb25QYXNzd29yZCI6IndTcGR6Vm05WDJrcEQ5YUoifQ==");
- Create a NativeLabelRecognizer instance:
NativeLabelRecognizer labelRecognizer = new NativeLabelRecognizer();
- Load the MRZ detection model:
labelRecognizer.loadModel();
- Detect MRZ from an image file:
ArrayList<MrzResult> results = (ArrayList<MrzResult>)labelRecognizer.detectFile(fileName);
- Get the MRZ information by decoding the MRZ lines:

String[] lines = new String[results.size()];
for (int i = 0; i < results.size(); i++) {
    lines[i] = results.get(i).text;
}
JsonObject info = MrzParser.parse(lines);
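Before handing the lines to MrzParser.parse, it can help to check that the recognizer returned a plausible set of lines. The following sketch is an optional pre-check, not part of the article's code. It assumes the ICAO 9303 sizes (TD1: three lines of 30 characters, TD2: two lines of 36, TD3: two lines of 44), and the names MrzFormatSketch, classify, and pad are hypothetical.

```java
class MrzFormatSketch {
    // Pads a fragment with the MRZ filler character '<' up to the given width.
    static String pad(String s, int width) {
        StringBuilder sb = new StringBuilder(s);
        while (sb.length() < width) {
            sb.append('<');
        }
        return sb.toString();
    }

    // Returns "TD1", "TD2", or "TD3", or null when the lines fit none of these layouts.
    static String classify(String[] lines) {
        if (lines == null || lines.length == 0 || lines[0] == null) {
            return null;
        }
        int width = lines[0].length();
        for (String line : lines) {
            if (line == null || line.length() != width) {
                return null; // all lines of one MRZ have the same length
            }
        }
        if (lines.length == 3 && width == 30) return "TD1";
        if (lines.length == 2 && width == 36) return "TD2";
        if (lines.length == 2 && width == 44) return "TD3";
        return null;
    }

    public static void main(String[] args) {
        // Made-up fragments padded to passport width.
        String[] passport = {
            pad("P<UTOERIKSSON<<ANNA<MARIA", 44),
            pad("L898902C3", 44)
        };
        System.out.println(classify(passport)); // prints TD3
    }
}
```

A null result is a cheap signal that the image was misread or cropped, so the caller can skip parsing or ask for another capture.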
Try the Sample Code
java -cp target/dlr-1.0.0-jar-with-dependencies.jar com.dynamsoft.dlr.Test images/1.png