Rails Guides 7.2 in Spanish Translated with OpenAI

Isis Harris - Aug 20 - Dev Community

railsguides.es

A while back, one of the things I did to get familiar with Ruby on Rails was to translate the Rails Guides to Spanish. Yes, it takes a lot of time. And just like any documentation, it gets outdated rather quickly.

A little background

The original translation took months of work, done before work, after work, or on the weekends.

MR to update Spanish translation

Since then, there was always the intent, and very slow progress, toward updating to version 7, but it never got done. And now that we can use Google Translate or ChatGPT, why would we really need translated guides?

This month, I started to look up what the current versions were and whether any work had been done on translations. I learned a few things.

1: I found an article about someone already using AI to translate them: "Dear AI, can you translate the Rails Guide for me?"

2: The translation links were removed from Rails in Oct 2023. It appears Rails wants to find a way to support them natively.

3: Rails 7.2 had just come out.

All of this got me going on testing what OpenAI could do to translate the guides to Spanish, instead of translating them manually.

Thank you to Kevin for taking the step to share your article a year ago about updating the Rails Guides to various languages. I started off with the same file names, but most of the content has been edited.

So how much would this cost:

According to GPT

“The GPT-4 model can handle up to approximately 25,000 words in a single interaction, which includes both the input message and the response generated by the model. Since 25,000 words is roughly equivalent to 100,000 characters, this would be the upper limit for the total number of characters that can be processed in one interaction.

Given that this total includes both the message you send and the model's response, the maximum length of the message you could send would be somewhat less than 100,000 characters. A good estimate would be to assume that about half of this capacity could be used for your input message, meaning the maximum length of a message you could send might be around 50,000 characters, allowing enough space for a response of similar length.”

I wanted to know how much this was going to cost at the time of doing this.

costs

The gpt-4o-mini model is more cost efficient than gpt-4o-2024-08-06 when we run the files to get translated.

I did a couple of test runs with both of these models to compare the translations. After comparing the response output, I preferred gpt-4o-2024-08-06.

The output came out closer to what I wanted it to look like, even with the same prompts. Then, as I started to read the content of the guides, I also preferred the language that gpt-4o-2024-08-06 used.

By default it also did not translate code blocks.

A few things I did in my code:

I used the openai-ruby gem.

I also wanted to handle the translation of all the files in a single method, instead of needing explicit instructions and a separate translation method for each type of file.

I ended up with this.

method

So I am just sending in the file and the prompts that guide the model.

There is a separate method for each of the 3 types of files in the guides. This was done just to give the model different prompts.
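As a rough illustration of that split (one translation method, with the prompt chosen per file type), here is a minimal sketch. It is not the code from the project: the prompt wording, the `PROMPTS` table, and the names `prompt_for` and `translate_file` are all hypothetical. The `chat(parameters:)` call shape and the response hash follow the community ruby-openai gem as I understand it, which is an assumption; the gem used in this post may expose a different interface.

```ruby
# Hypothetical prompts, one per file type; the real prompts are not shown in the post.
PROMPTS = {
  ".md"   => "Translate this Markdown guide to Spanish. Keep code blocks, links, and heading syntax unchanged.",
  ".yaml" => "Translate only the human-readable values in this YAML file to Spanish. Keep the keys unchanged.",
  ".erb"  => "Translate only the visible text in this ERB template to Spanish. Keep tags and Ruby code unchanged."
}.freeze

# Picks the system prompt from the file extension (index.html.erb => ".erb").
def prompt_for(path)
  PROMPTS.fetch(File.extname(path)) { raise ArgumentError, "no prompt for #{path}" }
end

# Single entry point: every file goes through the same call, only the prompt differs.
# `client` is assumed to respond to chat(parameters:) and to return a Hash shaped
# like a chat completion response.
def translate_file(client, path, content, model: "gpt-4o-2024-08-06")
  response = client.chat(
    parameters: {
      model: model,
      messages: [
        { role: "system", content: prompt_for(path) },
        { role: "user", content: content }
      ]
    }
  )
  response.dig("choices", 0, "message", "content")
end
```

Passing the client in as an argument keeps the method easy to try out with a stand-in object before spending any tokens.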

Some of the differences that I saw between models and the initial work:

Bulleted items were coming back with dashes instead of bullet points.

Some of the headings in the markdown files were also coming out like:

#Action Mailer Basics

instead of

Action Mailer Basics
====================
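The bullet and heading differences above could also be caught after the fact with a small post-processing pass. This is an alternative sketch, not the approach described in this post, and `normalize_guide_markdown` is a hypothetical name. It assumes the guides want `*` bullets and the underlined style for top-level headings, and it leaves fenced code blocks alone so that Ruby comments starting with `#` are not mistaken for headings.

```ruby
# Rewrites a top-level "# Title" (or "#Title") heading to the underlined style
# and leading "- " bullets to "* ", skipping anything inside fenced code blocks.
def normalize_guide_markdown(text)
  in_fence = false
  fence = "`" * 3
  out = text.lines(chomp: true).flat_map do |line|
    if line.lstrip.start_with?(fence)
      in_fence = !in_fence            # entering or leaving a code block
      [line]
    elsif in_fence
      [line]                          # never touch code
    elsif (m = line.match(/\A#(?!#)\s*(.+?)\s*\z/))
      [m[1], "=" * m[1].length]       # top-level heading only, not "##"
    elsif (m = line.match(/\A(\s*)- (.*)\z/))
      ["#{m[1]}* #{m[2]}"]            # keep indentation, swap the marker
    else
      [line]
    end
  end
  out.join("\n") + "\n"
end
```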

I also wanted to keep the hyperlinks in the same place in the markdown file instead of at the bottom of the page.

This also got rid of the error of the links within the page not working. So I no longer had to edit those manually.

The link to the project is: https://github.com/latinadeveloper/railsguides.es/tree/es-translation-7-2

There is a directory, es-7-mini-model, that has the files translated with the mini model, in case anyone is curious about the output and translation.

I only did the translation for the website and chose not to do the EPUB (Kindle) version for now.

One thing I did run into was the 429 error, too many requests.

I didn't do any work on translating the files in batches, so that will be a follow-up on this project. Some inner page links are still not fully generated by OpenAI; further prompt tuning is needed there.
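One common way to cope with a 429 is to wait and retry with exponential backoff. The sketch below is only an illustration of that idea and is not part of the project. `RateLimitError` and `with_backoff` are hypothetical names; in real code you would rescue whatever exception your HTTP client or gem raises for a 429 response.

```ruby
# Hypothetical error standing in for whatever the client raises on HTTP 429.
class RateLimitError < StandardError; end

# Runs the block, retrying with exponential backoff when it raises RateLimitError.
# `sleeper` is injectable so the waits can be recorded or skipped while testing.
def with_backoff(max_attempts: 5, base_delay: 1.0, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield
  rescue RateLimitError
    raise if attempt >= max_attempts                 # give up after the last attempt
    sleeper.call(base_delay * (2**(attempt - 1)))    # 1s, 2s, 4s, ...
    retry
  end
end
```

Each file translation could then be wrapped in a `with_backoff { ... }` block, so a temporary rate limit slows the run down instead of stopping it.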

Total tokens: 772322, Total cost: $3.86161

Screenshot from start to finish, including set up.


Breakdown of each file.

File Name Total Tokens Cost ($)
7_0_release_notes.md 5765 0.028825
7_1_release_notes.md 16434 0.08217
active_record_callbacks.md 12023 0.060115
active_record_composite_primary_keys.md 21377 0.106885
active_record_encryption.md 4598 0.02299
active_record_migrations.md 11944 0.05972
active_record_multiple_databases.md 29971 0.149855
active_record_postgresql.md 11692 0.05846
active_record_querying.md 12190 0.06095
active_record_validations.md 39487 0.197435
active_storage_overview.md 28775 0.143875
active_support_instrumentation.md 28302 0.14151
action_mailer_basics.md 15902 0.07951
api_app.md 17878 0.08939
autoloading_and_reloading_constants.md 9556 0.04778
configuring.md 14584 0.07292
api_documentation_guidelines.md 85618 0.42809
asset_pipeline.md 7723 0.038615
association_basics.md 23343 0.116715
caching_with_rails.md 41625 0.208125
classic_to_zeitwerk_howto.md 14966 0.07483
command_line.md 8878 0.04439
contributing_to_ruby_on_rails.md 13416 0.06708
debugging_rails_applications.md 18054 0.09027
development_dependencies_install.md 22538 0.11269
documents.yaml 4042 0.02021
engines.md 2288 0.01144
error_reporting.md 23233 0.116165
form_helpers.md 3673 0.018365
generators.md 29046 0.14523
getting_started.md 9454 0.04727
getting_started_with_devcontainer.md 33916 0.16958
i18n.md 2437 0.012185
index.html.erb 27371 0.136855
initialization.md 762 0.00381
layout.html.erb 9338 0.04669
layouts_and_rendering.md 3821 0.019105
maintenance_policy.md 26178 0.13089
plugins.md 2088 0.01044
rails_application_templates.md 7718 0.03859
rails_on_rack.md 3825 0.019125
routing.md 5289 0.026445
ruby_on_rails_guides_guidelines.md 29928 0.14964
security.md 2555 0.012775
testing.md 36505 0.182525
threading_and_code_execution.md 38265 0.191325
tuning_performance_for_deployment.md 5691 0.028455
working_with_javascript_in_rails.md 6391 0.031955
upgrading_ruby_on_rails.md 6372 0.03186
Total 772322 3.86161
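
The costs in this breakdown are consistent with a flat rate of $5 per 1,000,000 tokens: for example, 772322 × 0.000005 = 3.86161. That rate is inferred from the numbers in the table, not taken from a price list, and it does not distinguish input tokens from output tokens. A minimal sketch of the arithmetic, with `USD_PER_TOKEN` and `cost_for` as hypothetical names:

```ruby
# Flat rate implied by the figures in the table (an inference, not an official price).
USD_PER_TOKEN = Rational(5, 1_000_000)

# Estimated cost in dollars for a given token count, rounded to 6 decimals.
def cost_for(tokens)
  (tokens * USD_PER_TOKEN).to_f.round(6)
end

puts cost_for(5_765)     # 7_0_release_notes.md
puts cost_for(772_322)   # whole run
```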

After all of this, I still need to read them, and I would love for any other native Spanish speaker to contribute if things are off.

At the end of the day, I wanted to have them updated in case it helps anyone who prefers to read them in their own language, and it was a fun way to try out OpenAI.
