ldami

Saturday, September 9, 2023

Perl cheese and wine

In case you didn't know from my previous posts, I've been programming in Perl for the last 25 years.

During a week of holidays in Alto Adige, northern Italy, I stumbled upon:

- a cheese dairy called "Perl Hof". Unfortunately it was closed when I passed by, so I couldn't taste the products.
- a wine called "Perl", made from the "Lagrein" grape, which is very specific to that region. Quite dark in color, rich and fruity, very pleasant ... I brought a couple of bottles home :-) Come and see me if you want to taste it!

Saturday, February 24, 2018

The 3rd generation of DBIx::DataModel is on CPAN

The 3rd generation of DBIx::DataModel has just been published to CPAN.

DBIx::DataModel is an object-relational mapping (ORM) framework for building Perl abstractions (classes, objects and methods) that interact with relational database management systems. It provides facilities for generating SQL queries, joining tables automatically, navigating through the results, converting values, assembling complex data structures and packaging the results in various formats.

Some of its strong points are:

- centralized, UML-style declaration of tables and relationships (instead of many files with declarations such as 'has_many', 'belongs_to', etc.)
- limited coupling with the database schema: there is no need to declare every column of every table; DBIx::DataModel only needs to know about tables, associations, primary keys and foreign keys
- exposure of database operations like joins, bulk updates, subqueries, etc. The database is not hidden behind object-oriented programming concepts, as some other ORMs try to do, but rather made to explicitly collaborate with the object-oriented layer.
- efficiency through a very lightweight infrastructure and through fine tuning of interaction with the DBI layer (prepare/execute, fetch into reusable memory location, etc.)
- usage of SQL::Abstract::More for an improved API over SQL::Abstract (named parameters, additional clauses, simplified 'order_by', support for values with associated datatypes, etc.)
- clear conceptual distinction between:
  - data sources (tables and joins),
  - database statements (stateful objects representing stepwise building of an SQL query and stepwise retrieval of results),
  - data rows (lightweight hashrefs containing nothing but column names and values)
- simple syntax for joins, with the possibility to override default INNER JOIN/LEFT JOIN properties, and with clever usage of Perl multiple inheritance for simultaneous access to the methods of all tables that participate in that join
- nested, cross-database transactions
- choice between 'single-schema' mode (default, more economical) and 'multi-schema' mode (optional, more flexible, but a little more costly in memory)
- detailed documentation exposing not only the surface API but also the internal architecture and design principles
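To give a feel for the centralized declaration and the join syntax listed above, here is a minimal sketch. The schema name HR, the tables T_Department and T_Employee and all their columns are hypothetical, and an in-memory SQLite database (through DBD::SQLite) is assumed only to keep the example self-contained; see the module documentation for the authoritative API.

```perl
use strict;
use warnings;
use DBI;
use DBIx::DataModel;

# Centralized declaration: only tables, primary keys and one association.
# Schema, table and column names are hypothetical.
DBIx::DataModel->Schema('HR')
  ->Table(qw/Department T_Department dpt_id/)
  ->Table(qw/Employee   T_Employee   emp_id/)
  ->Association([qw/Department department 1 dpt_id/],
                [qw/Employee   employees  * dpt_id/]);

# Throwaway in-memory database (assumption: DBD::SQLite is installed).
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       {RaiseError => 1, AutoCommit => 1});
$dbh->do('CREATE TABLE T_Department (dpt_id INTEGER PRIMARY KEY, dpt_name TEXT)');
$dbh->do('CREATE TABLE T_Employee (emp_id INTEGER PRIMARY KEY, lastname TEXT, dpt_id INTEGER)');
HR->dbh($dbh);

# Columns other than the keys were never declared to the schema.
HR->table('Department')->insert({dpt_id => 1, dpt_name => 'Research'});
HR->table('Employee')->insert(
  {emp_id => 1, lastname => 'Bach',   dpt_id => 1},
  {emp_id => 2, lastname => 'Mozart', dpt_id => 1},
);

# Join by following the role name declared in the association.
my $rows = HR->join(qw/Department employees/)->select(
  -columns  => [qw/dpt_name lastname/],
  -order_by => 'lastname',
);
print "$_->{dpt_name}: $_->{lastname}\n" for @$rows;
```

Note that the join is expressed through the role name "employees" rather than through an explicit ON clause, and that the data rows come back as plain hashrefs holding column names and values.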

Initially published in 2005, DBIx::DataModel had a first refactoring in 2008 and a second refactoring in 2011. The novelties brought by this third refactoring of 2018 are:

- architectural simplification: suppression of the ConnectedSource class
- new extensible architecture for result kinds produced by calls to select(). Builtin result kinds include various data structures and file formats; applications can easily plug in additional result kinds.
- facilities for switching between several database schemas within the same database connection
- method do_after_commit() for registering coderefs to be executed after the end of the outermost transaction
- option join_with_USING for generating joins of the form "Table1 JOIN Table2 USING (common_key)"
- restructuring of the update() method for easier extension by application subclasses
- complete revision of the documentation

Saturday, February 17, 2018

Very positive and refreshing articles on Perl 5 -- example for embedded systems

Michel Conrad recently wrote two very positive articles on Perl 5. They were relayed on the LinkedIn Perl group, but I haven't seen them on the usual Perl sites, so I share them here in the hope that they will spread further:

https://opensource.com/article/18/1/why-i-love-perl-5

https://opensource.com/article/18/1/my-delorean-runs-perl

The second article is very interesting in that it shows an unusual application area for Perl: realtime graphics for a car dashboard! This reminded me of a very interesting presentation many years ago at the French Perl Workshop 2005, where we saw an application driving all vehicles on the apron of Port Airport.

Sunday, September 7, 2014

back from Swiss Perl Workshop 2014

The second Swiss Perl Workshop just ended, and it was really nice.

I must confess that I was quite skeptical when Roman and Matthias first came up with such an idea, thinking that French-speaking Swiss people would rather go to the Journées Perl in Paris, and German-speaking Swiss people would rather go to the Deutscher Perl Workshop. But I was wrong: people came, and I was very happy to meet new local people doing Perl. The organization was tip-top: a well-chosen location in the center of Switzerland, in a cosy, ancient house, with nice food eaten in the garden and good Valpolicella ripasso wine.

For this event I expanded my YAPC::EU lightning talk on virtual tables for SQLite: the expanded slides are online at http://www.slideshare.net/ldami/sq-lite-virtualtables. The other talk, on App::AutoCRUD, was the same as at YAPC (slides here).

Thanks again to Roman and Matthias, and see you next year.

Wednesday, July 30, 2014

Plack::App::* namespace is not for apps - so which is the proper CPAN namespace?

Oops ... I just realized that I had misunderstood the intent of the Plack::App namespace. The top-level Plack doc explicitly says:

DO NOT USE Plack:: namespace to build a new web application or a framework. It's like naming your application under CGI:: namespace if it's supposed to run on CGI and that is a really bad choice and would confuse people badly.

and the 2009 Plack Advent Calendar goes even further with

Think twice before using Plack::App::* namespace. Plack::App namespace is for middleware components that do not act as a wrapper but rather an endpoint. Proxy, File, Cascade and URLMap are the good examples. If you write a blog application using Plack, Never call it Plack::App::Blog, okay? Name your software by what it does, not how it's written.

OK, sorry, I got this wrong when publishing Plack::App::AutoCRUD -- but in my defense, I'm not alone: several other CPAN authors did the same.

The app is quite young, so there is still time to fix its name (even if this operation will be quite tedious, because it involves changes in all module sources, in the CPAN distribution, in the github repository name, and in the upcoming YAPC::EU::2014 talk). But if I want to be a good citizen and undertake such an operation, what should the proper name be? The CPAN namespace is becoming a bit crowded, as already noted 2 years ago by Joel Berger. For choosing a name, there seem to be several controversial and perhaps contradictory principles:

- CPAN is for modules, not for apps: this was argued in 2008 in a PerlMonks discussion on the same topic; however, many people replied in disagreement. I disagree too: publishing a Perl app on CPAN makes perfect sense because we take advantage of the CPAN infrastructure for tests, dependency management, publication, etc. Furthermore, applications can be extended or forked, just like modules, so CPAN is a perfect environment for sharing.
- publish under the App::* namespace: this is the PAUSE recommendation. But applications in the App::* namespace are mainly command-line utilities, which is quite different from Web applications. As a matter of fact, nobody has used the App::Web namespace yet -- maybe it's time to start?
- use a ::Web or ::WebApp suffix at the end of the module name: I never saw this as a recommendation, but nevertheless many distributions adopted this approach. This is certainly appropriate if the main goal is to publish a functionality Foo::Bar, and by the way, there is also a web app at Foo::Bar::WebApp. But if the purpose of the whole distribution is just a web app, this approach tends to create a new top-level namespace, which is not considered good practice. Should I choose AutoCRUD::WebApp? I think not, because other people might want to use the AutoCRUD::* namespace.
- avoid top-level namespaces: this used to be an important recommendation, but it doesn't seem to be well respected any more :-( -- nowadays I see more and more CPAN distributions taking up top-level names. I won't cite any particular example, so as not to offend anybody, but it's quite obvious if you look at the list of top-level namespaces ... and unfortunately many of those top-level names give no clue whatsoever about what kind of functionality will be found in the associated distribution.
- hide the technology underlying your app: the Plack argument above says that the app should be named after its functionality, not after its implementation technology. Well ... I'm not so sure that this is always appropriate. Many modules sit under the Tie::Hash::* namespace just because they use the tied hash technology, while providing various kinds of functionality.
  Concerning "Plack", when I see that keyword in a module name, I know that a) this is Web technology, and b) this will work on any kind of web server (as opposed to module names containing "Apache" or "Apache2"), and I consider this to be useful information for a potential user. Conversely, I didn't want to name my module DBIx::DataModel::AutoCRUD, even if it uses DBIx::DataModel quite heavily, because that's not hardwired into the architecture and I could easily imagine a later adaptation to support DBIx::Class as well.

So in the end I will probably settle on something like App::Web::AutoCRUD or WebApp::AutoCRUD ... unless somebody comes up with a better suggestion!

PS: see also Catalyst::Plugin::AutoCRUD, which can be used either as a Catalyst plugin or as an application on its own.

Friday, July 11, 2014

Perl virtual tables for DBD::SQLite : ready to test

Follow-up to my previous article: a first draft of Perl virtual table support for DBD::SQLite is available at https://github.com/DBD-SQLite/DBD-SQLite/tree/vtab .

This is still alpha software, but it shows the idea; I still don't know when this work will be mature enough for a CPAN publication.

Two examples of virtual table modules are bundled with the distribution :

- FileContent: implements a virtual column that exposes file contents. This is especially useful in conjunction with a SQLite FTS fulltext index; see the doc in Fulltext_search.pod
- PerlData: binds a virtual table to a Perl array within the Perl program. This can be used for simple import/export operations, for debugging purposes, for joining data from different sources, etc.
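As an illustration of the PerlData idea, here is a small sketch that binds a virtual table to a Perl array and queries it with plain SQL. It follows the interface as I know it from released versions of DBD::SQLite (sqlite_create_module and the arrayrefs option), which may differ from the alpha branch; the table name "produce" and the sample data are made up.

```perl
use strict;
use warnings;
use DBI;

# Ordinary Perl data. It must be a package variable ("our"), because the
# virtual table looks it up by its fully qualified name.
our $rows = [
  [1, 'apple',  'fruit'],
  [2, 'carrot', 'vegetable'],
  [3, 'pear',   'fruit'],
];

my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       {RaiseError => 1});

# Register the Perl class under the module name "perl" ...
$dbh->sqlite_create_module(perl => 'DBD::SQLite::VirtualTable::PerlData');

# ... and declare a virtual table on top of the array above.
$dbh->do(<<'SQL');
  CREATE VIRTUAL TABLE produce
  USING perl(id INT, name TEXT, kind TEXT, arrayrefs="main::rows")
SQL

# From here on it looks like a regular table.
my $names = $dbh->selectcol_arrayref(
  'SELECT name FROM produce WHERE id >= ? ORDER BY id', undef, 2);
print "@$names\n";
```

Since the table is only a view on the array, no data is copied into SQLite.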

I'm currently thinking of one more example, which would be fun to play with: a virtual table that would proxy to another DBI connection. Then we could join data from various sources, using SQLite's features to do the joining work. Sounds quite exciting, but at this point this is just an idea.

Thanks to Salvador Fandiño, who pointed me to https://metacpan.org/pod/SQLite::VirtualTable, which is similar in idea but is meant to embed a Perl interpreter inside a SQLite application, rather than the other way around; this code helped me build the DBD::SQLite version.

Of course any comments/suggestions are welcome.

Friday, July 4, 2014

project: Perl virtual tables for DBD::SQLite

During the year I have more and more management tasks and less time for programming. So for the holidays I wanted a change and decided to embark on a really "hardcore programming" project, namely to add support for Perl virtual tables within the DBD::SQLite database driver. SQLite has this notion of "virtual tables", which look like regular tables but are implemented through callback routines. This project implies some C programming, using the Perl XS API, and the delicate part is to design some appropriate glue between SQLite's notion of "object-oriented", through extensible C structures and callbacks, and Perl's object-oriented features.

At the beginning I wasn't even sure if such a project would be feasible, but now it is slowly taking shape and I'm pretty confident that it will eventually reach something usable. The concept is quite similar to Perl's tied variables, where a published API is reused for accessing many different kinds of data; except that here the published API is SQL instead of hashes or array operators. As a result, we could have virtual tables bound to the filesystem, to the Win32::Registry, to some configuration data, or any other accessible resource. This will open a new field for lots of creative ideas.
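For readers less familiar with tied variables, here is a minimal core-Perl sketch of the analogy: a hypothetical class ConfigLines (the name and the "name = value" format are invented for this illustration) that exposes a list of text lines through the ordinary hash API. A virtual table plays the same role, with SQL as the published API.

```perl
use strict;
use warnings;

# Hypothetical class: exposes "name = value" lines through the hash API.
package ConfigLines {
  use Carp;

  sub TIEHASH {
    my ($class, @lines) = @_;
    my %data;
    for my $line (@lines) {
      my ($name, $value) = $line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/
        or next;                          # skip malformed lines
      $data{$name} = $value;
    }
    return bless {data => \%data, iter => []}, $class;
  }

  sub FETCH  { my ($self, $key) = @_; return $self->{data}{$key} }
  sub EXISTS { my ($self, $key) = @_; return exists $self->{data}{$key} }
  sub STORE  { croak "ConfigLines is read-only" }

  # Iteration support for keys() and each(): sorted key order.
  sub FIRSTKEY {
    my ($self) = @_;
    $self->{iter} = [sort keys %{$self->{data}}];
    return shift @{$self->{iter}};
  }
  sub NEXTKEY { my ($self) = @_; return shift @{$self->{iter}} }
}

# The client code only sees a hash; the callbacks above do the work.
tie my %conf, 'ConfigLines', 'host = localhost', 'port = 5432';
print "$_ => $conf{$_}\n" for keys %conf;
```

The client code never calls the class directly: every hash operation is routed to a callback, just as every SQL operation on a virtual table is routed to a callback of its implementing module.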

My main motivation for doing this work is to be able to build a framework for collections of documents, using SQLite for the fulltext index and the filesystem for storing document content: this will be a much more powerful replacement for my very old File::Tabular::Web::Attachment::Indexed module. That module is still heavily used at the Geneva courts of law, but now we have 10 years of data, and the old architecture is clearly showing its limits.

For the virtual tables project I need some test cases, so if anybody has ideas about Perl-accessible data to be published as an SQL table, I'm interested.