read_frame bug: ForeignKey lookup fails if any Null values present #128

Open
odoublewen opened this issue Dec 29, 2020 · 2 comments

@odoublewen

ForeignKey lookup works as long as no null values are present. However, I also have models where the foreign key is nullable, for example:

class Foo(models.Model):
    name = models.CharField(max_length=64, unique=True)
    description = models.CharField(max_length=128, unique=True)

    def __str__(self):
        return self.name


class Sample(models.Model):
    foo = models.ForeignKey(Foo, on_delete=models.PROTECT, null=True, blank=True)

read_frame returns the expected results as long as foo is non-null for every object in the queryset:

In [25]: read_frame(Sample.objects.filter(Q(id=637)), fieldnames=['id', 'foo'])
Out[25]: 
    id          foo
0  637           XY

...but if a single null is present, foo becomes None for every row in the DataFrame:

In [26]: read_frame(Sample.objects.filter(Q(id=637)|Q(id=241)), fieldnames=['id', 'foo'])
Out[26]: 
    id          foo
0  241         None
1  637         None

Somewhat similar to #93?

@TheAbhilash23

@odoublewen Please let me know if the following workaround is useful to you.

MyModel (parent) in models.py

class MyModel(models.Model):
    MyModelId = models.BigAutoField(
        _('Id'),
        primary_key=True
    )
    Name = models.CharField(
        _('Name'),
        max_length=225,
        null=True,
        blank=True
    )
    Date = models.DateField(
        _('Date'),
        null=True,
        blank=True
    )
    DateTime = models.DateTimeField(
        _('DateTime'),
        null=True,
        blank=True
    )
    Integer = models.IntegerField(
        _('Integer'),
        null=True,
        blank=True
    )
    Float = models.FloatField(
        _('Float'),
        null=True,
        blank=True,
    )

    def __str__(self):
        return f'{self.pk} - {self.Name}'

    class Meta:
        db_table = 'MyModel'
        verbose_name = 'My Model Object'
        verbose_name_plural = 'My Model Objects'

    @classmethod
    def get_dataframe(cls, instance=None):
        if not instance:
            qs = cls.objects.all()
            return read_frame(qs, ('MyModelId',
                                   'MyForeignKeyModels__Name',
                                   'MyForeignKeyModels__pk',
                                   'Name',
                                   'Date',
                                   'DateTime',
                                   'Integer',
                                   'Float',))

MyForeignKeyModel is defined as

class MyForeignKeyModel(models.Model):
    MyForeignKeyModelId = models.BigAutoField(
        _('Id'),
        primary_key=True
    )
    MyModel = models.ForeignKey(
        'MyModel',
        on_delete=models.CASCADE,
        related_name='MyForeignKeyModels',
        null=True,
        blank=True
    )
    Name = models.CharField(
        _('Name'),
        max_length=225,
        null=True,
        blank=True
    )
    Date = models.DateField(
        _('Date'),
        null=True,
        blank=True
    )
    DateTime = models.DateTimeField(
        _('DateTime'),
        null=True,
        blank=True
    )
    Integer = models.IntegerField(
        _('Integer'),
        null=True,
        blank=True
    )
    Float = models.FloatField(
        _('Float'),
        null=True,
        blank=True,
    )

    def __str__(self):
        return f'{self.pk} - {self.Name}'

    @classmethod
    def get_dataframe(cls, instance=None, *args, **kwargs):
        if not instance:
            qs = cls.objects.all()
            # 'MyModel__Name' follows the nullable ForeignKey to the parent model.
            return read_frame(qs, ('MyForeignKeyModelId',
                                   'Name',
                                   'Date',
                                   'DateTime',
                                   'Integer',
                                   'Float',
                                   'MyModel__Name'))

    class Meta:
        db_table = 'MyForeignKeyModel'
        verbose_name = 'My Foreign Key Model'
        verbose_name_plural = 'My Foreign Key Models'

 

In [1]: from MyApp1.models import *

In [2]: df = MyModel.get_dataframe()
MyApp1.MyModel.MyModelId
<ManyToOneRel: MyApp1.myforeignkeymodel>
MyApp1.MyForeignKeyModel.Name
<ManyToOneRel: MyApp1.myforeignkeymodel>

FieldDoesNotExist => MyForeignKeyModel has no field named 'pk'
'Options' object has no attribute 'get_all_related_objects_with_model' (this method was deprecated in Django 1.8 and removed in Django 1.10; see the Django docs for the deprecation)


The result I get on calling the classmethod of MyModel is:

   MyModelId MyForeignKeyModels__Name  MyForeignKeyModels__pk       Name        Date                  DateTime  Integer    Float
0          1                 huyjgjhm                       1  masdfasdf  2023-04-28 2023-04-28 07:14:16+00:00        4    3.122
1          1                 bvnbvnvb                       2  masdfasdf  2023-04-28 2023-04-28 07:14:16+00:00        4    3.122
2          2                   321321                       5    ppolkjm  2023-04-28                       NaT     3423  123.000
3          2                   ij9045                       6    ppolkjm  2023-04-28                       NaT     3423  123.000
 


and the result on calling the classmethod of MyForeignKeyModel is:

In [3]: df
Out[3]: 
   MyForeignKeyModelId      Name        Date                  DateTime  Integer     Float MyModel__Name
0                    1  huyjgjhm  2023-04-28 2023-04-28 07:14:28+00:00      NaN       NaN     masdfasdf
1                    2  bvnbvnvb  2023-04-28                       NaT     45.0    34.000     masdfasdf
2                    3  sdfgdsfg  2023-04-28 2023-04-28 07:15:10+00:00    333.0  2342.000          None
3                    4  gfhju767  2023-04-28 2023-04-28 07:15:31+00:00  12323.0       NaN          None
4                    5    321321  2023-04-28 2023-04-28 07:16:06+00:00      NaN   112.000       ppolkjm
5                    6    ij9045  2023-04-28                       NaT   7744.0     1.025       ppolkjm
 

NOTE: There were 2 objects of MyModel and 6 objects of MyForeignKeyModel.
2 of them (ids 3 and 4) had a null value for the ForeignKey.

@avtrosty

avtrosty commented Aug 29, 2023

Good afternoon.
I ran into the same problem.
When querying a model with a ForeignKey column, if one record has a value and another does not, both come back as None.
If the ForeignKey is filled in on both records, everything works fine.

I did a little research and traced it to this code in django_pandas/utils.py:

  def replace_pk(model):
      base_cache_key = get_base_cache_key(model)
  
      def get_cache_key_from_pk(pk):
          return None if pk is None else base_cache_key % str(pk)

If any of the ForeignKey values is empty, get_cache_key_from_pk receives pk as a float for every filled record and as NoneType for the empty ones. If all records have a non-null ForeignKey, pk arrives as an int. As a result, str(pk) on a float appends '.0' to the identifier, and the resulting cache key is not found.
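The mismatch described above can be reproduced without Django or pandas. This is only a minimal sketch: the key template `"myapp.foo.%s"` and the plain dict standing in for the cache are assumptions made for illustration, not the actual django-pandas internals.

```python
# Hypothetical stand-ins: the key template and the dict "cache" are
# assumptions for illustration, not taken from django-pandas.
base_cache_key = "myapp.foo.%s"
cache = {base_cache_key % 637: "XY"}  # entry stored under an integer-based key


def key_with_str(pk):
    # Mirrors the str(pk) formatting quoted above.
    return None if pk is None else base_cache_key % str(pk)


int_key = key_with_str(637)      # pk arrives as int
float_key = key_with_str(637.0)  # pk arrives as float

found_with_int = cache.get(int_key)
found_with_float = cache.get(float_key)

print(int_key, found_with_int)
print(float_key, found_with_float)
```

The float-based key carries a trailing `.0`, so it never matches the entry stored under the integer-based key, and the lookup falls back to None.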

If you change str(pk) to int(pk), everything works:

  def replace_pk(model):
      base_cache_key = get_base_cache_key(model)
  
      def get_cache_key_from_pk(pk):
          return None if pk is None else base_cache_key % int(pk)
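One caveat with a bare `int(pk)`: it raises `ValueError` on NaN and on non-numeric string primary keys. A more defensive variant might convert only whole-number floats and pass everything else through. The sketch below assumes a hypothetical helper named `normalize_pk`; it is not part of django-pandas.

```python
import math


def normalize_pk(pk):
    """Return a pk suitable for building a cache key, or None if it is missing.

    Hypothetical helper for illustration only.
    """
    if pk is None:
        return None
    if isinstance(pk, float):
        if math.isnan(pk):
            return None      # treat NaN as a missing foreign key
        if pk.is_integer():
            return int(pk)   # e.g. 637.0 becomes 637
    return pk                # ints and strings pass through unchanged
```

With such a helper, the cache key could be built from `normalize_pk(pk)` instead of `int(pk)`, keeping the fix for float-typed keys while leaving other key types untouched.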

Dear developers, please check this and, if it is the right solution, include it in a future release.
