-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[scalding] java.lang.IllegalArgumentException: given types array contains null #20
Comments
In cascading-jdbc we need a way to determine the database types, so that we can create the table correctly. You can either do that by giving the types in the TableDesc object or we can also determine/guess them by looking at the Fields. If you pass the types via TableDesc, we should not run into this problem. Can you try that? That being said, it is also a problem, if we have a null handling bug. I will see, if I can look into that in the coming days. |
Sorry I'm quite new to the code base, what is the mechanism to provide types to a There are many constructors in EDIT: What would be useful for me is a description of how this mechanism has changed since in 2.5.3 compared to 2.5.2. How as the type inferred before? That would help me change the Scalding source code to make this work again. |
I tried this snippet of code, which as far as I can tell, is the only type-related thing in val tableDesc = new TableDesc(tableName) //, columnNames, columnDefinitions, null)
tableDesc.completeFromFields(new Fields("value1", Integer.TYPE))
println("providing table desc")
println(tableDesc)
tableDesc The fields were set as expected in TableDesc, but did not solve the problem :(
Since that didn't work, I'll do my best to provide an overview of the problem as I understand it from jumping in the Cascading code. There are many constructors in // JDBCScheme.java#L199
public JDBCScheme( Class<? extends DBInputFormat> inputFormatClass, Class<? extends DBOutputFormat> outputFormatClass, String[] columns, String[] orderBy, String conditions, String[] updateBy ) {
this( inputFormatClass, outputFormatClass, columns, orderBy, conditions, -1, updateBy );
}
// which calls
public JDBCScheme( Class<? extends DBInputFormat> inputFormatClass, Class<? extends DBOutputFormat> outputFormatClass, String[] columns, String[] orderBy, String conditions, long limit, String[] updateBy ) {
this( inputFormatClass, outputFormatClass, new Fields( columns ), columns, orderBy, conditions, limit, updateBy != null ? new Fields(updateBy ) : null, updateBy );
} As you can see, a |
Sorry for the consecutive updates, testing things as I go. I've confirmed that using a |
There was indeed a bug with the type handling and version 2.5-4-wip-87 contains a fix. Please try that. One thing to keep in mind though is, that some RDMBs systems like postgres do not support automatic coercion/casting, meaning if we try to write a tuple with a String to an int field, it will fail. By using Fields instances with proper types, we can avoid that problem, since we do a proper type coercion in java, before handing the data to the JDBC driver. I added a small paragraph to the README about that. If this works for you, we can do a release of cascading-jdbc next week. |
Thanks @fs111. Does the current master include 2.5.4-wip-87? I've tried On Sat, Aug 2, 2014 at 8:06 AM, André Kelpe [email protected]
|
There is no master. The changes are in the wip-2.5 branch. |
Oops, sorry I meant HEAD instead of master. |
It's because the checking for nulls in Fields types is new in 2.5.5 // 2.5.4
public Fields( Comparable[] fields, Type[] types )
{
this( fields );
if( isDefined() && types != null )
{
if( this.fields.length != types.length )
throw new IllegalArgumentException( "given types array must be same length as fields" );
this.types = copyTypes( types, this.fields.length );
}
}
//2.5.5
public Fields( Comparable[] fields, Type[] types )
{
this( fields );
if( isDefined() && types != null )
{
if( this.fields.length != types.length )
throw new IllegalArgumentException( "given types array must be same length as fields" );
if( Util.containsNull( types ) )
throw new IllegalArgumentException( "given types array contains null" );
this.types = copyTypes( types, this.fields.length );
}
} |
Can you give me a minimal example, preferrably a test, that triggers your problem? I currently don't see, how there can be a null in the types array. |
We just use the constructor new JDBCScheme(
null, // inputFormatClass
null, // outputFormatClass
columnNames.toArray,
null, // orderBy
filterCondition.getOrElse(null),
updateByFields.toArray
) Which as far as I can tell gives no types to the fields, but is eventually expected. It seems to me all these constructors that don't directly provide a Fields argument as columns with types won't work at all. Could you give an example of the correct usage of these constructors? You mentioned providing types through TableDesc, but as far as I can tell there is no way to do that. You can find a minimal scalding example at the initial post of the ticket. I don't have the setup to run vanilla cascading applications, although if you really need it I can look into setting it up. I will look into whether writing a test is easier. |
Looking at the tests for JDBCScheme it looks like the test is written such that sink fields are just instantiated and passed with types. I jumped through cascading code (which I am very, very unfamiliar with so I could be completely wrong) but it looked to me like in practice these fields are actually derived from |
You are looking at the wrong branch. This test has not types on the fields: https://github.com/Cascading/cascading-jdbc/blob/wip-2.5/cascading-jdbc-core/src/test/java/cascading/jdbc/JDBCTestingBase.java#L214 If you create a Fields instance with types, you have to give exactly one type per field and none of them is allowed to be null. If you don't give types at all, it will work as well. That is what I fixed in 2.5.4-wip-87. Can you confirm, that you are using that version? |
I guess I was looking at the wrong test? On Wed, Aug 6, 2014 at 8:36 AM, André Kelpe [email protected]
|
Ah sorry, gave the branch another try and it worked. When you said a fix was in, I didn't realize you meant it was a very recent fix. I thought you were referring to a fix in the far past before this ticket. I had not pulled then wip-2.5 branch in a few days. Sorry about the confusion, and thanks for the fix! Could we do a new cascading-jdbc release as you mentioned? |
wip releases are published on every push. http://conjars.org/search?q=jdbc ckw On Aug 6, 2014, at 3:42 PM, Akihiro Matsukawa [email protected] wrote:
Chris K Wensel |
2.5.4 is released. Closing this one. |
This isn't exactly a bug in cascading-jdbc and more likely an in Scalding integration with it, but since I couldn't find any documentation or examples, thought I would file an issue and ask. I hope that's ok.
Scalding JDBC integration seems to have completely broken with 2.5.3 (and with current master).
Produces an error:
This seems to be caused by
fields.getType[i]
returningnull
here:cascading-jdbc/cascading-jdbc-core/src/main/java/cascading/jdbc/JDBCScheme.java
Line 752 in 5487ab4
This is part of the new
JDBCScheme
mechanism introduced in 2.5.3. Must all fields have types attached to them for jdbc integration to work now?The text was updated successfully, but these errors were encountered: