A Windows XP help forum. PCbanter


Newspaper Tracking



 
 
  #16  
Old August 10th 18, 01:55 PM posted to alt.comp.os.windows-10
nospam
external usenet poster
 
Posts: 2,094
Default Newspaper Tracking

Mayayana wrote:


... it made me wonder whether a site
can really track visitors, realistically, by IP.


absolutely, along with a lot of other things.
  #17  
Old August 10th 18, 03:27 PM posted to alt.comp.os.windows-10
sxgvegovujvdhdfortughjk
external usenet poster
 
Posts: 3
Default Newspaper Tracking

On 8/10/2018 5:47 AM, Mayayana wrote:

Yet I can always read articles at NYT as long as I allow cookies.
Though I've never tried reaching their daily limit and deleting
cookies.


Daily limit? The current NYT limit is 5 articles per MONTH.

they claim their online subscriptions are very successful


Digital subscriber numbers have recently shot up in the newspaper
industry in general, a phenomenon some refer to as the "Trump bump"...
  #18  
Old August 10th 18, 09:04 PM posted to alt.comp.os.windows-10
Sam E[_2_]
external usenet poster
 
Posts: 193
Default Newspaper Tracking

On 08/10/2018 01:58 AM, chris wrote:

[snip]

Many websites don't even
have their own IP address.


False. All web domains have an IP address. Again by definition. Bigger
domains will exist on multiple IPs for resilience.


That one was true. Read the "THEIR OWN" part. Nothing was said about not
having an IP address.
  #19  
Old August 10th 18, 10:12 PM posted to alt.comp.os.windows-10
Mark Blain
external usenet poster
 
Posts: 79
Default Newspaper Tracking

Dave C wrote:

There are three people in my house. All of us read newspaper posts
from our prior home town paper; using the same Internet access
provider.

Each month, all of us quickly learn that we have exceeded the
monthly "article" limit. Is there any way I can reset whatever
page view "counter" is used, so that each of us can have our own
individual usage limit (not a collective one, i.e. the sum of our
three users)?


That's called a paywall. If you want the newspaper to stay in business,
consider helping them out. Having said that, some common workarounds
that may work include opening the article link with JavaScript disabled
(which some browsers allow) or opening it in private browsing mode.
  #20  
Old August 12th 18, 06:50 AM posted to alt.comp.os.windows-10
chris[_5_]
external usenet poster
 
Posts: 2
Default Newspaper Tracking

On Fri, 10 Aug 2018 08:47:46 -0400, "Mayayana"
wrote:

"chris" wrote

| I was under
| the impression that a cable connection is actually
| part of a party line, with hundreds of other customers
| sharing the same IP address.
|
| Not sure how cable companies do it, but ADSL and fibre connections all
| have an IP allocated to them from the pool that their service provider
| owns. I imagine cable is not that different.
|

I wonder. There's been talk for some time of IP4
addresses being short.


https://en.wikipedia.org/wiki/IPv4_address_exhaustion


And as Char noted, in the
early days of cable, neighbors used to find each
other in their Network Neighborhood. It seems that
it would be cheaper and allow for more expansion
if cable companies can share IP addresses across
a group.


It's easier to give everyone a unique IP address. Large scale NAT is
called "Carrier Grade NAT" (CGNAT), but most ISPs don't want to deal
with its complexities, nor do they have the hardware or the capabilities
to do so. NAT in a SoHo router is one thing; NAT for an entire ISP is a
whole different world.
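
To make the NAT idea concrete, here is a minimal sketch of port-based address translation, where many private hosts share one public IP. Everything in it is an assumption for illustration: the class name TinyNat, the addresses, and the port range are made up, and real CGNAT equipment also has to handle timeouts, logging, and per-subscriber port limits.

```python
import itertools


class TinyNat:
    """Toy port-address translation: many private hosts share one public IP."""

    def __init__(self, public_ip, first_port=1024, last_port=65535):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)  # next free public port
        self._last_port = last_port
        self._out = {}   # (private_ip, private_port) -> public_port
        self._back = {}  # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip, private_port):
        """Map an outgoing connection to (public_ip, public_port)."""
        key = (private_ip, private_port)
        if key not in self._out:
            port = next(self._ports)
            if port > self._last_port:
                raise RuntimeError("no free public ports left")
            self._out[key] = port
            self._back[port] = key
        return self.public_ip, self._out[key]

    def translate_in(self, public_port):
        """Find which private host a reply on public_port belongs to."""
        return self._back.get(public_port)


nat = TinyNat("203.0.113.7")  # documentation address, not a real ISP
a = nat.translate_out("100.64.0.10", 50000)
b = nat.translate_out("100.64.0.11", 50000)
print(a, b)  # two subscribers, same public IP, different public ports
```

From the outside, both subscribers appear as 203.0.113.7. Only the port tells them apart, which is why a website cannot distinguish them by IP alone.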


So it seems farfetched that everyone
online could have their own IP address. But I don't
know the details of how it works. I thought maybe
someone else might.


It's not farfetched at all, and lots of people know how it works. What
questions do you have?


| Many websites don't even
| have their own IP address.
|
| False. All web domains have an IP address.
| Again by definition.

They have an IP address by definition, but not
necessarily a dedicated IP address, which is what
I'm wondering about.


It's called virtual hosting.
https://en.wikipedia.org/wiki/Virtual_hosting

IIS (Internet Information Services) ships with Windows 10 as an optional
feature that you enable under "Turn Windows features on or off". IIS
supports virtual hosting, so you can test it there. Just create multiple
sites in IIS, all using the same IP but different root folders. IIS will
check the Host header on each request to see which site is being
requested.
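
The lookup a name-based virtual host performs can be sketched in a few lines. This is not how IIS is implemented; the site table, folder paths, and the function name root_for are all hypothetical and only show the idea of picking a root folder from the Host header.

```python
# Hypothetical site table: one IP address, several host names, one folder each.
SITES = {
    "aaa.com": "/srv/www/aaa",
    "bbb.com": "/srv/www/bbb",
    "ccc.com": "/srv/www/ccc",
}


def root_for(host_header, sites=SITES):
    """Return the root folder for a Host header value, or None if unknown."""
    # Host names are case-insensitive and may carry a port ("aaa.com:8080").
    # Assumes plain host names; IPv6 literals like "[::1]:80" are not handled.
    host = host_header.strip().lower().split(":", 1)[0]
    return sites.get(host)


print(root_for("AAA.com"))       # /srv/www/aaa
print(root_for("bbb.com:8080"))  # /srv/www/bbb
print(root_for("zzz.com"))       # None
```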


If you look at webhosting options you'll see that
a dedicated IP is sometimes an option. Probably the
cheapo servers like Dreamhost don't even offer it.
That limits how many customers they can have.


Not really. Putting it another way, virtual hosting limits the number of
IP addresses that are required.


And IP4 addresses have already run out.


No, they haven't.
https://en.wikipedia.org/wiki/IPv4_address_exhaustion


Shared IP means aaa.com, bbb.com and
ccc.com can all have the same IP address, which
points to their server. A requested page would
then be determined from the GET. So Dreamhost
doesn't have to dedicate either a device or an IP
to an individual customer. They just put each domain
in a separate folder on one machine and figure it
out as the GETS come in.


You're talking about the Host header, which is mandatory in HTTP/1.1
requests (what you call GETs). You can see it with any HTTP-aware tool.
The easiest is probably curl.exe with the -v option.
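
If you would rather see the header without any tool, here is a small sketch that builds the text of an HTTP/1.1 request by hand and reads the Host header back out of it. The helper names build_get and host_of are made up for this example, and nothing is sent over the network.

```python
def build_get(host, path="/"):
    """Build the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",        # tells a shared server which site is wanted
        "Connection: close",
        "",                     # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def host_of(raw_request):
    """Return the Host header value from raw request bytes, or None."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("ascii")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip()
    return None


request = build_get("aaa.com", "/index.html")
print(request.decode("ascii"))
print(host_of(request))  # aaa.com
```

Two requests for aaa.com and bbb.com sent to the same IP address differ only in that one header line, which is all the server needs to choose the folder.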


Given all that, it made me wonder whether a site
can really track visitors, realistically, by IP.


Generally, sites track visitors by cookie, but if cookies are disabled
at the client, they'll fall back to IP address. IP tracking is not
optimal, especially since an IP address could be used by different
people at different times, or different people at the same time.
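
Here is a rough sketch of what a metered counter with that fallback might look like. It is purely an assumption for illustration, not how any particular newspaper does it: the Meter class, the key scheme, and the addresses are invented, and the limit of 5 is just the figure quoted earlier in the thread.

```python
from collections import Counter

MONTHLY_LIMIT = 5  # figure quoted earlier in the thread; an assumption here


class Meter:
    """Count article views per cookie, falling back to IP without a cookie."""

    def __init__(self, limit=MONTHLY_LIMIT):
        self.limit = limit
        self.views = Counter()

    def allow(self, ip, cookie_id=None):
        """Return True and record the view if the reader is under the limit."""
        key = ("cookie", cookie_id) if cookie_id else ("ip", ip)
        if self.views[key] >= self.limit:
            return False
        self.views[key] += 1
        return True


meter = Meter()
household_ip = "198.51.100.23"  # documentation address, not a real reader

# Three readers behind one router with cookies disabled share one counter.
shared = [meter.allow(household_ip) for _ in range(6)]
# A reader who accepts a cookie gets a counter of their own.
own = meter.allow(household_ip, cookie_id="reader-1")

print(shared)  # [True, True, True, True, True, False]
print(own)     # True
```

The IP fallback lumps a whole household together, which matches what the original poster described.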


In fact,
it's not unusual in my own web logs to see commercial
GETs coming from numerous, similar IPs, even for one
page and its related images. And it's common (I don't
know why) to see things like an IP that resolves
to Brazil in terms of geolocation load a webpage,
followed by an IP from Europe that downloads a linked
file. Yet both show the same company in a hostname
resolution.


No idea what you're trying to say there.


-chris
  #21  
Old August 12th 18, 02:44 PM posted to alt.comp.os.windows-10
Mayayana
external usenet poster
 
Posts: 4,784
Default Newspaper Tracking

"chris" wrote

| It's easier to give everyone a unique IP address. Large scale NAT is
| called "Carrier Grade NAT" (CGNAT)

Ah. That's the term. I didn't know that. Looking
around online it looks like it's not so common, but I
didn't find definitive stats.

| If you look at webhosting options you'll see that
| a dedicated IP is sometimes an option. Probably the
| cheapo servers like Dreamhost don't even offer it.
| That limits how many customers they can have.
|
| Not really. Putting it another way, virtual hosting limits
| the number of IP addresses that are required.
|

That's saying the same thing. The number of IP
addresses they control is limited. If a host has a
block of 1,000 IP addresses they can host 1,000
sites, or they can do shared hosting and host any
number of sites. If you look at a place like
GoDaddy, their lower tier hosting is shared. For
$3/month they probably don't have an option,
and a lot of their hosting is for small businesses
that get no appreciable traffic.

|
| And IP4 addresses have already run out.
|
| No, they haven't.
| https://en.wikipedia.org/wiki/IPv4_address_exhaustion
|

Your link says they have.

| In fact,
| it's not unusual in my own web logs to see commercial
| GETs coming from numerous, similar IPs, even for one
| page and its related images. And it's common (I don't
| know why) to see things like an IP that resolves
| to Brazil in terms of geolocation load a webpage,
| followed by an IP from Europe that downloads a linked
| file. Yet both show the same company in a hostname
| resolution.
|
| No idea what you're trying to say there.
|

I'm saying that a single visitor can have multiple
IPs. Search bots do that routinely. Here are some
other examples from my logs. I process them to resolve
hostname and geo-location. These were all single
visitors:

google-proxy-66-102-6-188.google.com.Mountain View-CA-US
google-proxy-66-102-6-186.google.com.Mountain View-CA-US
google-proxy-66-102-6-184.google.com.Mountain View-CA-US

bzq-82-80-249-143.dcenter.bezeqint.net.--Israel
bzq-82-80-230-228.cablep.bezeqint.net.Petah Tiqwa-HaMerkaz-Israel
bzq-82-80-249-164.dcenter.bezeqint.net.--Israel

115.239.212.134.Hangzhou-Zhejiang-China - -
115.239.212.139.Hangzhou-Zhejiang-China - -
115.239.212.134.Hangzhou-Zhejiang-China - -
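
One simple way to spot "similar" addresses like these is to group them by network prefix. The sketch below uses the addresses from the log lines above; the /24 prefix length and the function name group_by_prefix are assumptions chosen for the example, not a rule about how any provider allocates addresses.

```python
import ipaddress
from collections import defaultdict


def group_by_prefix(addresses, prefix_len=24):
    """Group IPv4 addresses by the network they fall in at prefix_len bits."""
    groups = defaultdict(set)
    for addr in addresses:
        # strict=False lets us pass a host address and get its network back.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[str(net)].add(addr)
    return dict(groups)


seen = [
    "66.102.6.188", "66.102.6.186", "66.102.6.184",
    "82.80.249.143", "82.80.230.228", "82.80.249.164",
    "115.239.212.134", "115.239.212.139", "115.239.212.134",
]

for network, members in sorted(group_by_prefix(seen).items()):
    print(network, sorted(members))
```

With a /24 grouping, the three google-proxy addresses fall into one network, while the Bezeq addresses split into two.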

I've also seen cases where remote locations share.
For instance, a visitor from Acme in one location visits
a page, but a distant location downloads the files.

Google proxy is an interesting one. I've been finding
it to be increasingly common, with several people
per day using it. Online I've found instructions for
using it to get around paywalls. But I suspect that what
I'm seeing is actually Google tracking people through
Chrome. In other words, Google seems to be acting
as a proxy server without asking, in order to track 100%
of Chrome users' activity.

Combining that with the fact that a company could
have one IP for numerous people, and people
behind routers in the same house will share IP,
the whole idea of paywalling by IP address doesn't
seem very realistic. But I don't know of any stats
about how common it is vs cookies vs possible other
methods.


  #22  
Old August 12th 18, 04:43 PM posted to alt.comp.os.windows-10
chris[_5_]
external usenet poster
 
Posts: 2
Default Newspaper Tracking

On Sun, 12 Aug 2018 09:44:58 -0400, "Mayayana"
wrote:

"chris" wrote

| It's easier to give everyone a unique IP address. Large scale NAT is
| called "Carrier Grade NAT" (CGNAT)

Ah. That's the term. I didn't know that. Looking
around online it looks like it's not so common, but I
didn't find definitive stats.


Right, it's not common because of the potential side effects.
Implementing CGNAT could disrupt the customer networks or the ISP
networks, or both.

From https://en.wikipedia.org/wiki/IPv4_address_exhaustion, with the
most relevant part being the third paragraph:

[quote]
Transition mechanisms

As the IPv4 address pool depletes, some ISPs will not be able to provide
globally routable IPv4 addresses to customers. Nevertheless, customers
are likely to require access to services on the IPv4 Internet. Several
technologies have been developed for providing IPv4 service over an IPv6
access network.

In ISP-level IPv4 NAT, ISPs may implement IPv4 network address
translation within their networks and assign private IPv4 addresses to
customers. This approach may allow customers to keep using existing
hardware. Some estimates for NAT argue that US ISPs have 5-10 times the
number of IPs they need in order to service their existing
customers.[87] This has been successfully implemented in some countries,
e.g., Russia, where many broadband providers use carrier-grade NAT, and
offer publicly routable IPv4 address at an additional cost.

However the allocation of private IPv4 addresses to customers may
conflict with private IP allocations on the customer networks.
Furthermore, some ISPs may have to divide their network into subnets to
allow them to reuse private IPv4 addresses, complicating network
administration. There are also concerns that features of consumer-grade
NAT such as DMZs, STUN, UPnP and application-level gateways might not be
available at the ISP level. ISP-level NAT may result in multiple-level
address translation which is likely to further complicate the use of
technologies such as port forwarding used to run Internet servers within
private networks.
[unquote]
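
The address conflict mentioned in that third paragraph is easy to illustrate with Python's ipaddress module. This is only a sketch: the function name conflicts and the example ranges are made up. The 100.64.0.0/10 block is the shared address space that RFC 6598 reserves for carrier-grade NAT, set aside precisely so that it does not collide with the usual home ranges.

```python
import ipaddress

# Shared address space reserved for carrier-grade NAT (RFC 6598).
CGN_SHARED = ipaddress.ip_network("100.64.0.0/10")


def conflicts(isp_range, customer_lan):
    """True if the ISP-side private range overlaps the customer's LAN range."""
    isp = ipaddress.ip_network(isp_range)
    lan = ipaddress.ip_network(customer_lan)
    return isp.overlaps(lan)


# An ISP handing out 10.x addresses collides with a customer LAN on 10.1.2.x.
print(conflicts("10.0.0.0/8", "10.1.2.0/24"))           # True
# The dedicated CGNAT range does not collide with a typical home LAN.
print(conflicts(str(CGN_SHARED), "192.168.1.0/24"))     # False
print(ipaddress.ip_address("100.64.0.10") in CGN_SHARED)  # True
```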


|
| And IP4 addresses have already run out.
|
| No, they haven't.
| https://en.wikipedia.org/wiki/IPv4_address_exhaustion
|

Your link says they have.


It's more complicated than that. Large blocks of IP addresses,
especially classful allocations (Class A, B, or C) are no longer
available. The use of CIDR allows allocation of much smaller blocks, but
even so, what you need might not be available, so what happens is that
other organizations might return part of an earlier allocation, thus
making it available to be allocated to you, or an organization might go
out of business and sell off its address allocation (for example,
"Microsoft bought 666,624 IPv4 addresses from Nortel's liquidation sale
for 7.5 million dollars in a deal brokered by Addrex").
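
To put numbers on that, here is a short sketch of how block sizes follow from the CIDR prefix length, plus the per-address price implied by the Nortel figures quoted above. The example networks are arbitrary private and documentation ranges picked for illustration.

```python
import ipaddress


def block_size(cidr):
    """Number of addresses in a CIDR block."""
    return ipaddress.ip_network(cidr).num_addresses


class_a = block_size("10.0.0.0/8")        # old Class A size
class_b = block_size("172.16.0.0/16")     # old Class B size
class_c = block_size("192.168.1.0/24")    # old Class C size
small = block_size("198.51.100.0/29")     # a small CIDR allocation

print(class_a, class_b, class_c, small)   # 16777216 65536 256 8

# Implied price per address in the Nortel sale quoted above.
price_per_address = 7_500_000 / 666_624
print(round(price_per_address, 2))        # 11.25
```

So the "16 million" in a Class A block is 2 to the power of 24, and the Nortel deal works out to a little over 11 dollars per address.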

Also, note from the quoted section above that, by some estimates, US
ISPs have 5-10 times as many addresses as they need for their existing
customers, which leaves room for subscriber growth. So "exhaustion" and
"depletion" are both too harsh to describe the current situation, and
all of this applies only to IPv4. IPv6 totally changes the landscape.

[quote]
Reclamation of unused IPv4 space

Before and during the time when classful network design was still used
as allocation model, large blocks of IP addresses were allocated to some
organizations. Since the use of Classless Inter-Domain Routing (CIDR)
the Internet Assigned Numbers Authority (IANA) could potentially reclaim
these ranges and reissue the addresses in smaller blocks. ARIN, RIPE NCC
and APNIC have a transfer policy, such that addresses can get returned,
with the purpose to be reassigned to a specific recipient. However, it
can be expensive in terms of cost and time to renumber a large network,
so these organizations are likely to object, with legal conflicts
possible. However, even if all of these were reclaimed, it would only
result in postponing the date of address exhaustion.

Similarly, IP address blocks have been allocated to entities that no
longer exist and some allocated IP address blocks or large portions of
them have never been used. No strict accounting of IP address
allocations has been undertaken, and it would take a significant amount
of effort to track down which addresses really are unused, as many are
in use only on intranets.

Several organizations have returned large blocks of IP addresses.
Notably, Stanford University relinquished their Class A IP address block
in 2000, making 16 million IP addresses available. Other organizations
that have done so include the United States Department of Defense, BBN
Technologies, and Interop.
[unquote]


I'm saying that a single visitor can have multiple
IPs. Search bots do that routinely.

....
I've also seen cases where remote locations share.
For instance, a visitor from Acme in one location visits
a page, but a distant location downloads the files.


Those cases will almost never be true for human clients. If the goal is
to walk a site, though, then it's certainly possible and even likely
that the tasks of retrieving initial html documents could land on one
server farm for parsing, then embedded links could be assigned to other
farms for further parsing and ultimate retrieval until all page elements
have been accounted for. Everything would get correlated in the end, but
while it's happening you'd see requests from who knows where.

I believe part of what you're seeing is simply GSLB in action.

[quote]
Global Server Load Balancing (GSLB) is a technology which directs
network traffic to a group of data centers in various geographical
locations. Each data center provides similar application services, and
client traffic is directed to the optimal site with the best performance
for each client. GSLB monitors the health and responsiveness of each
site, and like Server Load Balancing, directs traffic to the site with
the best response times.
[unquote]
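
The selection step described in that definition can be sketched as "pick the healthy site with the best response time". The site names, the measurements, and the function name pick_site below are all invented; a real GSLB product also weighs things like geography and load.

```python
def pick_site(sites):
    """Return the name of the healthy site with the lowest response time."""
    healthy = [s for s in sites if s["healthy"]]
    if not healthy:
        return None  # nothing usable to direct traffic to
    return min(healthy, key=lambda s: s["response_ms"])["name"]


# Hypothetical data centers as seen from one client.
SITES = [
    {"name": "us-east", "healthy": True, "response_ms": 120},
    {"name": "eu-west", "healthy": True, "response_ms": 35},
    {"name": "sa-east", "healthy": False, "response_ms": 20},
]

print(pick_site(SITES))  # eu-west
```

Because the choice is made per request and per client, consecutive requests can legitimately land on, or come from, different locations.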



-chris
  #23  
Old August 12th 18, 04:59 PM posted to alt.comp.os.windows-10
Mayayana
external usenet poster
 
Posts: 4,784
Default Newspaper Tracking

"chris" wrote

| I'm saying that a single visitor can have multiple
| IPs. Search bots do that routinely.
| ...
| I've also seen cases where remote locations share.
| For instance, a visitor from Acme in one location visits
| a page, but a distant location downloads the files.
|
| Those cases will almost never be true for human clients.

But they are. See the google-proxy example. That
was part of a dozen or so GETs for one page. A
single visitor. I get a lot of those. All I'm saying is that
it's not realistic to block IPs when one visitor is showing
3 of them.

I'd still be interested if anyone has links about
the various methods of paywalling and their
popularity. So far the thread has only offered
educated guesses and hearsay.


 










All times are GMT +1.

