The Apache SeaTunnel community recently welcomed a new Committer: Dai Lai, a Big Data Architect at China Telecom Yikang. Coming from the healthcare industry, he brings domain expertise to the SeaTunnel project and is exploring how SeaTunnel can be combined with medical data value mining and AI model applications. Let’s look at how he became a Committer in the SeaTunnel community!
Profile
- Title: Big Data Architect at China Telecom Yikang Tech
- GitHub ID: dailai
- Personal Interests: Big data technologies (data integration, stream processing, etc.) and basketball.
Q&A
1. What contributions have you made to the community?
- Added the Jdbc-Iris connector to support collecting data from Donghua’s healthcare database.
- Developed the Opengauss-CDC connector, enabling real-time data collection from openGauss.
- Upgraded the Debezium version used by MySQL-CDC from 1.6.4 to 1.9.8, improving the stability of real-time collection and supporting schema evolution from MySQL-CDC to Jdbc-MySQL.
- Contributed support for real-time data ingestion into and egress from Paimon for data lake scenarios.
- Developed a connector inspection script that quickly displays all parameters of each connector, making it easier to build platforms on top of SeaTunnel and to integrate those parameters.
- Participated in the community’s technical discussions on SeaTunnel on YARN to support job-level resource isolation. I believe this feature will be available to users soon.
- PR collection: https://github.com/apache/seatunnel/pulls?q=is%3Apr+author%3Adailai+is%3Aclosed
2. Can you share the story behind your connection with Apache SeaTunnel?
As the demand for data-driven decision-making in the healthcare industry grows, unlocking the value of healthcare data and the potential of new productive forces is more urgent than ever. China Telecom Yikang has developed a "data platform" to manage the full lifecycle of healthcare data elements, creating a foundation for healthcare data operations that helps extract value from medical data and apply AI models. In this strategic context, the data integration platform serves as the "main artery" of our data platform, and we needed a solution that could be implemented quickly and meet the platform's complex data integration needs.
3. How long have you been involved in open-source? What attracts you to it?
I have been involved in open-source since February this year. Initially, I used SeaTunnel frequently and read its documentation to meet our business needs. As our requirements grew, I found that certain functionality fell short of what we needed, so I started adding features and fixing bugs and contributed them back to the community. The community members were very friendly and offered valuable suggestions and guidance, which strengthened my resolve to keep participating in open-source.
4. Have you previously researched data integration systems? Have you compared SeaTunnel with other competing products?
Yes, I have researched data integration, and I published an article titled "Experience Sharing on Building a Data Integration Platform Based on Apache SeaTunnel for the China Telecom Yikang Data Platform," which highlights the advantages of choosing Apache SeaTunnel.
5. Has your company used SeaTunnel? What is the use case? Have you developed any extensions based on SeaTunnel?
Our company uses SeaTunnel and is currently developing data integration features based on it. We support remote one-click deployment and full-chain data collection from hospital front-end nodes to the central platform.
6. What was your first impression of the SeaTunnel community? What do you hope to gain from it?
The community is very active, with lots of discussions. I hope to learn excellent coding techniques and feature designs.
7. What do you think are the most critical requirements for a data integration system? Does SeaTunnel meet these requirements? What optimizations or improvements do you expect SeaTunnel to make in the future?
The most critical requirements for data integration are ease of use, stability, fault tolerance, and a rich ecosystem. SeaTunnel currently meets these needs. In the future, I look forward to seeing the job-level resource isolation feature and its technical implementation.
8. What kind of personal growth support do you hope to gain from participating in the SeaTunnel community?
I hope to learn more excellent ideas and techniques.
9. What is your understanding of the role of a Committer? What responsibilities should a Committer take on in the community?
I believe that a Committer should be proactive in discovering and fixing bugs, adding features that meet community needs, and regularly reviewing others' pull requests to stay updated on new features, optimizations, or bug fixes.
10. How do you feel about being selected as a Committer? Is there anything you'd like to say to the community, or any suggestions for the project's development?
The strength of the community comes from everyone’s contributions. I hope everyone will contribute actively to make the community stronger and increase its impact, so that data integration can become the "main artery" of every data platform or intelligent platform.