cimm 3 months ago

> ethically sourced: opt-in only data collection

Good on them but how does this work? If my neighbour scans my WiFi network and uploads it to BeaconDB I didn’t exactly opt-in, did I? The privacy policy mentions you can add ‘_optout’ to the WiFi name, so it’s more opt-out than opt-in?

  • joelkoen 3 months ago

    This line refers to opting in to using your device to collect this data. Apple and Google are taking advantage of their global user coverage by using their users' devices to collect this data without those users' consent.

    Your WiFi network is broadcasting its presence 10 times a second in all directions. It is well known that you should not put sensitive information in your network SSID, for example, as anybody nearby can pick that up. Hence, you can opt out here instead.

    • dividuum 3 months ago

      While most users probably don't realize that they contribute to WiFi crowdsourcing, AFAIR using location services is opt-in on iOS. So "without their consent" doesn't seem true. The info popup also explicitly mentions the WiFi location crowdsourcing.

      • axelthegerman 3 months ago

        Sure, but any opted-in iOS user walking past other people's WiFi is crowdsourcing those networks without the network operators' consent.

        Unless they only contribute networks that the device has authenticated with.

  • FireInsight 3 months ago

    The person collecting the data opted in to doing it, heh. As far as the data collectors are concerned, your wifi is out in the public.

  • fc_on_hn 3 months ago

    > If my neighbour scans my WiFi network and uploads it to BeaconDB I didn’t exactly opt-in, did I?

    To clarify: all phones doing geolocation are already uploading your AP macaddr to remote location services, but BeaconDB will *not* publish this information in cleartext.

    Any data dump will contain only non-reversible cryptographically hashed data or aggregated data.

    • kevincox 3 months ago

      A MAC address is only 48 bits and some of the bits are restricted. It is well within the range of brute force to reverse all of the hashes.

      • joelkoen 3 months ago

        You can truncate the hash to cause collisions, meaning that one MAC address does not map to one location. This requires the client to be aware of multiple physically nearby MACs in order to get a location, as it then needs to estimate which "possible" locations are most likely.

        This is a really interesting problem, and I've loved thinking about it recently. If you're keen on it too I'm happy to discuss further, feel free to reach out.
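The truncated-hash idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not beaconDB's actual scheme: the 12-bit truncation, the flat (x, y) coordinates, and the names `bucket`, `build_index`, `spread` and `locate` are all hypothetical. Each MAC is hashed and truncated so that many MACs share a bucket, and the client picks one candidate position per visible MAC such that the picks cluster tightly.

```python
import hashlib
from itertools import combinations, product
from math import dist

HASH_BITS = 12  # assumed truncation length (max 32 here); shorter means more collisions


def bucket(mac: str, bits: int = HASH_BITS) -> int:
    """Truncate SHA-256 of the MAC to `bits` bits so many MACs share a bucket."""
    digest = hashlib.sha256(mac.lower().encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - bits)


def build_index(observations):
    """Map each bucket to every position reported for any MAC hashing into it."""
    index = {}
    for mac, position in observations:
        index.setdefault(bucket(mac), []).append(position)
    return index


def spread(points):
    """Largest pairwise distance within a set of candidate positions."""
    return max((dist(a, b) for a, b in combinations(points, 2)), default=0.0)


def locate(index, visible_macs):
    """Pick one candidate per visible MAC so the picks cluster tightly,
    then return the centroid of that cluster (None if nothing is known)."""
    candidates = [index[b] for b in map(bucket, visible_macs) if b in index]
    if not candidates:
        return None
    # Exhaustive search over combinations: fine for a sketch, exponential in general.
    best = min(product(*candidates), key=spread)
    return (sum(p[0] for p in best) / len(best),
            sum(p[1] for p in best) / len(best))
```

A single visible MAC cannot be disambiguated this way, since every candidate in its bucket is equally plausible; that is the property the comment describes.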

      • userbinator 3 months ago

        To put that into perspective, 48 bits is 256T, which is roughly the number of bits in a 32TB hard drive.
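The arithmetic behind this comparison, and behind the brute-force concern, can be checked with a few lines. The hash rate of one billion per second is purely an assumed figure for illustration, and `brute_force_hours` is a hypothetical helper name.

```python
KEYSPACE = 2 ** 48                 # every possible 48-bit MAC address (256 Ti)
UNICAST_KEYSPACE = 2 ** 47         # the multicast bit is 0 for an AP's own address
DRIVE_BITS = 32 * 10 ** 12 * 8     # bits in a 32 TB drive, decimal terabytes

ASSUMED_HASHES_PER_SECOND = 1e9    # assumption, not a measured figure


def brute_force_hours(hashes_per_second: float, keyspace: int = KEYSPACE) -> float:
    """Hours needed to hash every address in the keyspace once."""
    return keyspace / hashes_per_second / 3600


print(f"{KEYSPACE:,} possible addresses")
print(f"{KEYSPACE / DRIVE_BITS:.2f}x the bits in a 32 TB drive")
print(f"{brute_force_hours(ASSUMED_HASHES_PER_SECOND):.1f} hours at the assumed rate")
```

At that assumed rate the whole space takes a few days on one machine, which is why plain hashing of MAC addresses offers little protection.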

        • account42 3 months ago

          > and some of the bits are restricted

      • landdownsundar 3 months ago

        Absolutely right, great point. That's why I only use Windows addresses now. Can't break those with brute force!

  • petre 3 months ago

    You can opt to hide your SSID and use 5GHz WiFi which doesn't reach too far, gets attenuated through walls, so it's basically kind of useless as a geolocation beacon.

denysvitali 3 months ago

Last time I looked into something like this for GrapheneOS it wasn't possible to provide a custom location service.

It would be awesome to have this on GrapheneOS - so I'm very happy if someone knows a way to do this without using microG (I use the sandboxed GMS)

dangoodmanUT 3 months ago

The author doesn't seem to have an open source mobile app or anything that would allow them to source the data from devices themselves. I'm curious where the data was collected from, esp. if it was opt-in (at the collecting device)

  • joelkoen 3 months ago

    I haven't built any apps for contributing to beaconDB as of yet. The website links to NeoStumbler and TowerCollector, which are Android apps that can be used to collect this data.

    • dangoodmanUT 3 months ago

      Thanks, based on the copy I thought it was recently opened to contribution, and the original dataset had come from somewhere else.

      • dangoodmanUT 3 months ago

        I am curious what would cause such a distributed user base to contribute to this though?

        • joelkoen 3 months ago

          Distributed referring to the community not yet recognising one specific software as "the go to"? Or distributed physically?

          • dangoodmanUT 3 months ago

            Physically! Like how so many users from all over the place decided to contribute to this

            • joelkoen 3 months ago

              It is rather surprising how many people have started contributing already. I believe that people want to support alternatives to big tech so they aren't completely reliant on these providers, and beaconDB is currently the only database not owned by big tech. Not 100% sure that answers your question :)

              • dangoodmanUT 3 months ago

                Gotcha, I guess I was asking whether people specifically opted in to contributing to beaconDB, sounds like that's the case

a2800276 3 months ago

Wasn't the main issue with MLS that they got patent trolled/sued by Skyhook? Anyone know the patents involved and how beaconDB is avoiding the issues?

FireInsight 3 months ago

Reading the MLS retirement issue[1] it seems that multiple established organizations (e foundation, Graphene) are also interested in providing an alternative service. Does this mean that we're now in a situation where multiple open source location service providers are competing, or is this the only publicly accessible project in this space for now?

This project is cool and all, but seems to just be a one person effort with not a lot of engagement on GitHub[2]. Are you in talks with other people with similar goals to expand and collaborate on the project? Having the backing of an existing developer community could really bring this to the next level.

[1] https://github.com/mozilla/ichnaea/issues/2065

[2] https://github.com/beacondb/beacondb

Edit: the actual project seems to be on Codeberg[3], where there is a bit more engagement from others than the primary dev.

[3] https://codeberg.org/beacondb

  • joelkoen 3 months ago

    beaconDB is currently the only publicly accessible project, but I am currently discussing working together with various other projects and organisations.

    The project was originally on GitHub, but it has now moved to Codeberg.

    • jrexilius 3 months ago

      How is this different from WiGLE?

      • iczero 3 months ago

        WiGLE is very expensive to use.

  • gnufx 3 months ago

    For what it's worth, /e/ OS is now using its own location service, but I don't know what, if anything, restricts access to it.

k__ 3 months ago

Is there a reason the API doesn't return the locations of the access points so the clients can calculate their positions by themselves?

  • joelkoen 3 months ago

    This is planned to help clients cache data locally, which would improve the privacy of the client and reduce server load. I would like to implement this over the next few days.

    I have not yet found any clients that make use of such data. Please let me know if you have found one or are developing one.
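If the API did return access point locations, a client could estimate its own position locally. One common approach is a signal-weighted centroid; the sketch below is an assumption about how such a client might work, not something beaconDB or the Ichnaea API provides, and `estimate_position` is a hypothetical name.

```python
def estimate_position(access_points):
    """Estimate a position from (latitude, longitude, rssi_dbm) tuples.

    Stronger signals are assumed to come from closer access points, so each
    AP is weighted by its received power converted from dBm to milliwatts.
    Averaging raw coordinates is only reasonable over small areas.
    """
    if not access_points:
        return None
    weights = [10 ** (rssi / 10) for _, _, rssi in access_points]
    total = sum(weights)
    lat = sum(w * ap[0] for w, ap in zip(weights, access_points)) / total
    lon = sum(w * ap[1] for w, ap in zip(weights, access_points)) / total
    return lat, lon


# Example with made-up coordinates: the estimate leans towards the stronger AP.
print(estimate_position([(52.5200, 13.4050, -45), (52.5204, 13.4060, -70)]))
```

Because the lookup happens on the device, the server never learns which access points the client currently sees.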

    • k__ 3 months ago

      Ah, okay.

      I was just wondering whether there were any technical constraints preventing this.

      Because you mention Ichnaea API compatibility, and I didn't know if that spec even allows that.

jacooper 3 months ago

Hope GrapheneOS adds support for this soon, as currently their non-Google GPS Provider is basically hopeless unless you are outside.

yxOverKill 3 months ago

This is such a cool project. Always glad to see problem solvers filling the void left by MLS. (Unrelated, but the design looks great!)

  • joelkoen 3 months ago

    Thank you, this means a lot!

chenfeiyu132 3 months ago

Curious if the last data dump from MLS can still be downloaded anywhere? I can't seem to find it online. I'm working on a project that locates the connected tower based on mcc, mnc, cid, etc. I'm currently only sourcing data from OpenCelliD and Combain, so this would be a great addition!

disparate4927 3 months ago

Really nice, hopefully more software switches to this, I'm 100% gonna contribute

dangoodmanUT 3 months ago

Is this only offered as an API? E.g. you can't dump it and analyze locally?

  • dangoodmanUT 3 months ago

    > data dumps are currently not available as I'm still researching the measures I need to take to protect the privacy of both contributors and AP owners.

    Ah

    • joelkoen 3 months ago

      Yes, I really want to be able to release data dumps as this opens up a lot of great opportunities. I'm also worried that people may have lost trust in a service like MLS now that it has shut down and abandoned all of the data contributors had collected.

      At the moment, there simply isn't enough data to anonymise contributions.

chaz6 3 months ago

As nobody has yet mentioned it, there is also WiGLE [1] which has tracked over a billion unique networks.

[1] https://wigle.net/

  • jrexilius 3 months ago

    I was just going to ask, whatever happened to WiGLE and why build a clone of it rather than add to it?

    • acheong08 3 months ago

      WiGLE severely rate limits its APIs and doesn’t even allow normal people to pay for more access. They refuse to provide a data dump since they sell it to enterprises. No academic access either.

      People literally spend their time mapping APs and they don’t even get anything in return

      • iJohnDoe 3 months ago

        The couple of times I did a lookup it was woefully outdated as well.