[sword-devel] Script to find a best fit v11n

Greg Hellings greg.hellings at gmail.com
Wed Jun 18 23:40:15 EDT 2025


My script eschews percentages because they seemed relatively pointless to
me for measuring a mismatch like this. Instead, for a given versification,
it gives two counts covering both the Old and New Testaments: the osisIDs
it finds missing and the osisIDs it finds unexpectedly. If either count is
fewer than 100, the IDs for that count are printed to the console. It does
this for every versification registered in the version of the library it
was compiled against, so the user can select whichever one seems best
based on the results.
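
As a rough illustration of that comparison (a sketch only, not the script
itself): the Python below assumes the osisIDs have already been pulled out
of the OSIS file into a set, and it uses two made-up toy versifications in
place of the library's real canon tables. The names V11NS, PRINT_LIMIT,
expected_ids, compare and report are all hypothetical.

```python
# Hypothetical toy versifications: book -> verse count per chapter.
# Real data would come from the library's canon tables; these are made up.
V11NS = {
    "ToyA": {"Gen": [3, 2]},
    "ToyB": {"Gen": [3, 2, 1]},
}

# Only list individual IDs when a count is below this threshold.
PRINT_LIMIT = 100


def expected_ids(canon):
    """Build the set of osisIDs (Book.Chapter.Verse) a versification defines."""
    return {
        f"{book}.{chapter}.{verse}"
        for book, chapters in canon.items()
        for chapter, verse_count in enumerate(chapters, start=1)
        for verse in range(1, verse_count + 1)
    }


def compare(found, canon):
    """Return (missing, unexpected) osisIDs as alphabetically sorted lists."""
    expected = expected_ids(canon)
    missing = sorted(expected - found)      # in the v11n, absent from the text
    unexpected = sorted(found - expected)   # in the text, absent from the v11n
    return missing, unexpected


def report(found, v11ns):
    """Print both counts for every versification, plus short ID lists."""
    for name in sorted(v11ns):
        missing, unexpected = compare(found, v11ns[name])
        print(f"{name}: {len(missing)} missing, {len(unexpected)} unexpected")
        for label, ids in (("missing", missing), ("unexpected", unexpected)):
            if ids and len(ids) < PRINT_LIMIT:
                print(f"  {label}: {' '.join(ids)}")


if __name__ == "__main__":
    # A text with an extra chapter and an extra verse relative to ToyA.
    found = {"Gen.1.1", "Gen.1.2", "Gen.1.3",
             "Gen.2.1", "Gen.2.2", "Gen.3.1", "Gen.3.2"}
    report(found, V11NS)
```

Plain set differences keep the two counts independent, so a versification
with many missing IDs but few unexpected ones is easy to tell apart from
the reverse.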

On Wed, Jun 18, 2025, 10:25 PM David Haslam <dfhdfh at protonmail.com> wrote:

> It’s not just the number of “missing” verses that should figure in the
> percentage score, but also the number of verses that get concatenated to
> the last one in a chapter.
>
> The differences in v11n for the Psalms will be especially significant for
> this, in that some v11n renumber many of them. Likewise for the last few
> chapters in the book of Job.
>
> Aside: It would be cool to enhance the utility emptyvss by providing a
> command line option that would ignore books that are not included in the
> scope parameter in the conf file.
>
> Regards,
>
> David
>
> On Thu, Jun 19, 2025 at 03:18, DM Smith <dmsmith at crosswire.org> wrote:
>
> David,
>
> Because it only considers the xml, scope is automatically built into it.
> It is only comparing what is present in the xml with what is part of the
> av11ns.
>
> It might be good to add the enumeration of missing verses.
>
> — DM
>
> On Jun 18, 2025, at 4:02 PM, David Haslam <dfhdfh at protonmail.com> wrote:
>
> Does it take account of the Scope key in the .conf file for a less than
> complete Bible?
>
> David
>
> On Wed, Jun 18, 2025 at 20:51, DM Smith <dmsmith at crosswire.org> wrote:
>
> Hi,
>
> Several have commented on how hard it is to test an OSIS xml file against
> v11ns, especially since it goes off into an infinite loop. (I’ve posted a
> patch that fixes that.) But it is still a process of trial and error to
> find an appropriate v11n.
>
> So, I’ve been iterating with ChatGPT to create a Python script to find a
> best fit v11n. Since I don’t know Python, I can’t vouch for the script
> beyond that it worked for a simple test case that had an extra chapter for
> Genesis and had some extra verses at the end of a chapter in that book.
>
> I offer it, as a starting place. See the attached file.
>
> It has a --debug flag.
> The first argument is expected to be the OSIS xml file.
> The second argument is optional and gives the location to the include
> directory of svn/sword/trunk/include with all the canon*.h files. If you
> don’t supply the argument, it uses the web to load the canon*.h files from
> https://www.crosswire.org/svn/sword/trunk/include.
>
> It will score the fitness of each of the v11ns. It gives the score as a %,
> but I don’t know what that means. I told it that it should prioritize book
> matches, then chapter matches and finally verse matches. I don’t know how
> well it did that scoring. I didn’t test for that.
>
> The output is alphabetized. If more than one v11n has the same high
> score, they are all listed.
>
> In His Service,
> DM
>
> _______________________________________________
> sword-devel mailing list: sword-devel at crosswire.org
> http://crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
>