Questions about permissions/data format

riddim · October 29, 2016, 7:59pm

heyhey

i was investigating the python wrapper and uploading/downloading some data

before creating some “pretty programmed functions” i thought i’d ask some questions for making things compatible to other apps:
…i tried to upload a data file via python (i just chose the background image of the demo-app-template)

since python is pretty picky in terms of data-encoding i had to read some basics on data encoding first before knowing what’s happening and why uploading this jpg wasn’t as easy as i hoped xD …
python first encoded the file via ascii - that didn’t go well … so i chose to use ‘utf-8’ because that sounded to me like a good choice … and then it worked - i uploaded the file - downloaded it again - and the checksums are identical =) … so i wanted to implement a working binary-file-up/download in the safeAPI-python-library

question1: to me it seems like the demo app uploads the *.jpg encoded as ‘latin-1’ …? is there a reason for this? (if i’m correct) because i thought utf-8 would be more widespread …?

question2: is there an overview on permissions and what is allowed to do with which permission …?
(because i had problems getting some public data with the app that has low-level-api-permission and i’m not sure if that is because of my wrong code/data or because of a lack of permissions …?)

davidpbrown · October 29, 2016, 8:59pm

Have you looked at the list of libraries from March? SAFE API Client Libraries

I’ve no idea the state of python in those… the Rust ones seem to have aged but I’ve been taking some inspiration from what’s there and trying to create a new set in Rust.

riddim · October 29, 2016, 9:20pm

thx - yes - i am using https://github.com/hintofbasil/PySafeAPI as starting point (and many things do work very well already =) if i get github running i want to help expanding this wrapper a little bit (i’m not experienced in python but i’ll do my very best to not produce too crappy code)

ben · October 30, 2016, 8:34am

That’s odd indeed. Could you point to the exact line of code where this is happening? I think you might only be looking at an HTTP header here. For that it wouldn’t matter, because *.jpg and the like are binary anyway. Encoding only matters for text-based information you display without a specific decoder.

Either way, the Demo app hasn’t been changed in a while, since the launch of the very first Alpha Network, and as it worked throughout (look at us providing stable APIs), we didn’t spend much time on it lately. But it looks like we will soon. So I think this is just a relic of earlier times, not a best practice or anything.

At this very moment there are only very few permissions: LOW_LEVEL and DRIVE. DRIVE is an emulation of a POSIX-style NFS on top of the same data types that you can access and use with the LOW_LEVEL permission. That said, the LOW_LEVEL permission allows for other ways of access, too, although they are more crude.

However, we are working on an overhaul of this entire system, as these permissions are currently only enforced by the Launcher/intermediary proxy to the network. And that is not a sustainable data access model for any standalone app (like on mobile). We already have some rough ideas how we want to do that instead, but we need to get the RFCs into a publishable state first …

riddim · October 30, 2016, 9:24am

very cool you answered that fast - i’ll come back to you later … no time now … =)

thanks anyway!

WhiteOutMashups · October 30, 2016, 1:10pm

Interesting how popular the pySAFE python wrapper is; it’s the first thing I hear mentioned about SAFE whenever I bring it up at Meetups. Seems to be really popular or at least well known in the python community.

riddim · October 30, 2016, 1:40pm

that’s because python is awesome xD

…ok i’m new to all this python thing and i’m not a good programmer at all …
…but give me some time and maybe i’ll manage to use python + pyQt for a GUI program → pyinstaller → pack it into an executable file for windows+linux (+mac will be possible as soon as i have one)

so … quick and easy program development => for end-users just download => doubleclick

riddim · October 30, 2016, 3:29pm

sorry no - i only see symptoms
(i’ll focus on the last 15 symbols of the file to demonstrate what i mean)

this is what i get after uploading the template-website:

the same data encoded via utf-8 looks like:

i saved the picture to my disk and after reading it with python i get: (latin-1)

now depending on in which format i upload it i get different results when downloading it again:

via decode(xyz).encode(xyz) i can get the data back i wanted and everything … but that i can upload the exact same data and it looks differently seems a little bit strange to me

my guess is that the demo app uses the pc’s default encoding for the upload (which is latin-1 on my pc) and not explicitly utf-8 … ?
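For reference, a minimal sketch of how to see which defaults Python itself would apply on a given machine. It only describes the Python side (it says nothing about what the demo app does), and the values printed depend on the platform and Python version.

```python
import locale
import sys

# Encoding Python normally picks for text-mode open() when none is given.
# This is platform and locale dependent.
preferred = locale.getpreferredencoding(False)

# Encoding used by str.encode() / bytes.decode() when called without arguments.
default = sys.getdefaultencoding()

print("text-mode default:", preferred)
print("str/bytes default:", default)
```

Neither default comes into play when a file is opened in binary mode ("rb" / "wb").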

ben · October 30, 2016, 4:02pm

I can’t quite follow you on what you are trying to do here:

You are reading the source of a binary file (jpg) directly from the stream (which is a python bytestream from what I can see) and comparing it to what will be saved on disc? And because your disc and operating system use latin-1 you assume that this was encoded in latin-1? Binary data isn’t encoded on your filesystem or by your operating system. Binary data is stored as is; encoding is only needed when you try to represent the same data as text to the user. You’d actually “destroy” the image data if you changed the encoding of a binary file (like a jpg).

Regarding your third image: of course if you encode the same text differently, for many encodings you’ll come up with different content. Just do this in python: stringData.decode('latin-1').encode('utf-8') == stringData.decode('latin-1').encode('latin-1') for the data you are using. Unless all bytes are in the ASCII range (which both Latin-1 and UTF-8 encode identically), this will give you different results. Which is clearly the case for the data in question (which is binary).
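To illustrate the point about binary data, here is a small sketch that copies a file once byte for byte and once through a text re-encoding, then compares checksums. The file names and the fake JPEG bytes are made up for the example; nothing here talks to the network.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in binary mode."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def copy_binary(src, dst):
    """Copy a file byte for byte; no codec is involved."""
    with open(src, "rb") as reader, open(dst, "wb") as writer:
        writer.write(reader.read())


def copy_reencoded(src, dst):
    """Misuse: treat the bytes as Latin-1 text, then write them out as UTF-8."""
    with open(src, "rb") as reader:
        text = reader.read().decode("latin-1")
    with open(dst, "wb") as writer:
        writer.write(text.encode("utf-8"))


# Stand-in for a real image: a few bytes ending in the JPEG end-of-image marker.
sample = b"\xff\xd8\xff\xe0fake-jpeg-body\xff\xd9"
workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "original.jpg")
with open(original, "wb") as handle:
    handle.write(sample)

good = os.path.join(workdir, "copy.jpg")
bad = os.path.join(workdir, "reencoded.jpg")
copy_binary(original, good)
copy_reencoded(original, bad)

print(sha256_of(original) == sha256_of(good))  # True
print(sha256_of(original) == sha256_of(bad))   # False
```

The re-encoded copy is also larger, because every byte above 0x7f becomes two bytes in UTF-8.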
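The comparison can be tried on a couple of short byte strings. This sketch assumes Python 3, where the raw data is a `bytes` object (in the Python 2 of the thread it would be a plain `str`); the helper name `same_in_both` is made up for the example.

```python
def same_in_both(raw):
    """True if the Latin-1 text of `raw` encodes identically in UTF-8 and Latin-1."""
    text = raw.decode("latin-1")
    return text.encode("utf-8") == text.encode("latin-1")


ascii_only = b"JFIF"      # every byte is below 0x80
jpeg_tail = b"\xff\xd9"   # JPEG end-of-image marker, both bytes above 0x7f

print(same_in_both(ascii_only))                     # True
print(same_in_both(jpeg_tail))                      # False
print(jpeg_tail.decode("latin-1").encode("utf-8"))  # b'\xc3\xbf\xc3\x99'
```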

So when trying to read them as latin-1 and then store them again as latin-1 you’ll come back to what you initially had (with a lot of wasted memory), but if you changed the encoding you’d come up with something else. Any decode/encode round trip with one and the same codec ends up with \xff\xd9 again, as you didn’t actually do anything at all (for utf-8 that only holds when the bytes are valid utf-8 to begin with, which arbitrary binary data usually is not).
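A quick check of these round trips, assuming Python 3 `bytes`. Note that the strict UTF-8 decoder rejects the byte `\xff`, so a UTF-8 round trip only applies to data that already is valid UTF-8. The helper `round_trip` is made up for the example.

```python
def round_trip(raw, codec):
    """Decode and re-encode with the same codec; return None if decoding fails."""
    try:
        return raw.decode(codec).encode(codec)
    except UnicodeDecodeError:
        return None


every_byte = bytes(range(256))
jpeg_tail = b"\xff\xd9"
valid_utf8 = "\u00e9".encode("utf-8")  # the letter e with acute accent

print(round_trip(every_byte, "latin-1") == every_byte)  # True
print(round_trip(jpeg_tail, "latin-1"))                 # b'\xff\xd9'
print(round_trip(jpeg_tail, "utf-8"))                   # None
print(round_trip(valid_utf8, "utf-8"))                  # b'\xc3\xa9'
```

Latin-1 maps each of the 256 byte values to exactly one character, which is why its round trip never fails and never changes anything.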

Neither of them is proof that data is stored/encoded as latin-1. Which also doesn’t make any sense, as we are talking about binary data.

We are using Rust internally, which takes whatever data you give it and stores it AS IS on the network. While this internally uses vectors of u8 (unsigned 8-bit integers, i.e. raw bytes, not Unicode), we aren’t ever doing any string operations on the content, and so we aren’t doing any encoding or anything like that either; we just store whatever is passed. If you upload a file, its binary content will be read and stored exactly like that on the network. No encoding.

You can learn more about the differences of String handling between Python and Rust in this excellent blog post.

riddim · October 30, 2016, 4:59pm

yeah … that is what i assumed first too … sorry yeah i should have posted it too to show what i meant … here comes the tricky part …

same is with downloading binary files from safenet

if i wouldn’t have had those problems initially i wouldn’t have said anything

but heyhey - that one is interesting!

hmhmmm … ok … i’ll just use utf-8 and stop wondering why all that happens…

dirvine · October 30, 2016, 5:01pm

I suspect this will be the simplest way forward. Let us know how that works for you.