Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

All cells are empy #149

Open
woigl opened this issue Feb 6, 2018 · 4 comments
Open

All cells are empy #149

woigl opened this issue Feb 6, 2018 · 4 comments

Comments

@woigl
Copy link

woigl commented Feb 6, 2018

I am loading an XLSX file and I get the correct worksheet name and also the correct number of rows and columns(cells).

Worksheets: 1
Columns: 59
Rows: 29075

The issue is that all the cells printed as empty by using the following code:

$Reader = new SpreadsheetReader('/var/www/tmp/ArtikeldatenStandard.xlsx');
$Sheets = $Reader -> Sheets();

foreach ($Sheets as $Index => $Name)
{
    echo 'Sheet #'.$Index.': '.$Name;
    
    $Reader -> ChangeSheet($Index);
    
    foreach ($Reader as $Row)
    {
        print_r($Row);
    }
}

I don't want to post here the XLSX file. In case someone needs it for debugging purpose, then please let me know how I should provide the file (email, FTP, ...)
@woigl
Copy link
Author

woigl commented Feb 16, 2018

I've resolved the issue by myself.

After opening the .xlsx file with Excel and saving it again, it seemed to work. Therefore I've analyzed the issue further.

I figured out that the .xlsx files have a slight difference in the sst element, which messes up the function PrepareSharedStringCache(). The function is looking for the count attribute, which was in my case not there and therefore it exited at the if condition.

The working sst element:

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="1640" uniqueCount="809">

The not working sst element:

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" uniqueCount="809">

To resolve the matter, I've added the check for uniqueCount in case if count is not available. The corrected functions looks like this:

private function PrepareSharedStringCache()
{
	while ($this -> SharedStrings -> read())
	{
		if ($this -> SharedStrings -> name == 'sst')
		{
			$this -> SharedStringCount = $this -> SharedStrings -> getAttribute('count');
			if (!$this -> SharedStringCount) $this -> SharedStringCount = $this -> SharedStrings -> getAttribute('uniqueCount');
			break;
		}
	}

	if (!$this -> SharedStringCount || (self::SHARED_STRING_CACHE_LIMIT < $this -> SharedStringCount && self::SHARED_STRING_CACHE_LIMIT !== null))
	{
		return false;
	}

	$CacheIndex = 0;
	$CacheValue = '';
	while ($this -> SharedStrings -> read())
	{
		switch ($this -> SharedStrings -> name)
		{
			case 'si':
				if ($this -> SharedStrings -> nodeType == XMLReader::END_ELEMENT)
				{
					$this -> SharedStringCache[$CacheIndex] = $CacheValue;
					$CacheIndex++;
					$CacheValue = '';
				}
				break;
			case 't':
				if ($this -> SharedStrings -> nodeType == XMLReader::END_ELEMENT)
				{
					continue;
				}
				$CacheValue .= $this -> SharedStrings -> readString();
				break;
		}
	}

	$this -> SharedStrings -> close();
	return true;
}

If someone has a better solution, please drop me a message.

@woigl
Copy link
Author

woigl commented Feb 16, 2018

Does this project still have active contributors?

If not, can someone add me as a contributor to add my changes, or do I need to clone the whole project?

@kAlvaro
Copy link

kAlvaro commented Jul 19, 2018

In other words:

Index: SpreadsheetReader_XLSX.php
===================================================================
--- SpreadsheetReader_XLSX.php	(revision 645)
+++ SpreadsheetReader_XLSX.php	(working copy)
@@ -435,6 +435,7 @@
 				if ($this -> SharedStrings -> name == 'sst')
 				{
 					$this -> SharedStringCount = $this -> SharedStrings -> getAttribute('count');
+					if (!$this -> SharedStringCount) $this -> SharedStringCount = $this -> SharedStrings -> getAttribute('uniqueCount');
 					break;
 				}
 			}

It's complicated (not your fix, your fix is great and it's apparently the only way to use this library to read cells that contain text). The project is clearly abandoned and there're 23 pending pull requests. There're some forks that incorporate some basic bugfixes (I'm currently trying https://github.com/virtua-network/spreadsheet-reader) but... go figure. I presume nobody wants to marry this library and keep an up-to-date fork with all the fixes.

@woigl
Copy link
Author

woigl commented Nov 23, 2020

@kAlvaro it seems there is no current maintainer assigning anyone of us as maintainers for this project. I would love to take care of it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants