Messages - TA

1
General Discussion about just::thread / Re: ATOMIC_VAR_INIT
« on: February 17, 2011, 03:38:49 PM »
Got it. Thanks.

2
General Discussion about just::thread / ATOMIC_VAR_INIT
« on: February 17, 2011, 03:22:57 PM »
Anthony,

Could you please explain what ATOMIC_VAR_INIT is used for? I couldn't find any useful information on it (or maybe I just couldn't understand it). As far as I understand, it is used to initialize atomic variables in some special way, but what kind of initialization is done? When might it be useful? All the examples I could find were
atomic<int> t = ATOMIC_VAR_INIT(2);
why not just
atomic<int> t = 2;
what's the difference?
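For comparison, here is a minimal sketch of the initialization forms in question. The macro originates from compatibility with C, where an atomic object cannot be initialized through a constructor, and it is deprecated since C++20, so the sketch guards it with `#ifdef`. The variable names and the helper `initial_values_agree` are made up for illustration.

```cpp
#include <atomic>

// Direct-initialization: calls the constexpr converting constructor.
std::atomic<int> direct(2);

// Brace-initialization: same constructor, list syntax.
std::atomic<int> braced{2};

#ifdef ATOMIC_VAR_INIT
// The macro form from the question. Guarded because the macro is
// deprecated since C++20 and may eventually disappear.
std::atomic<int> via_macro = ATOMIC_VAR_INIT(2);
#endif

// Hypothetical helper: checks that every form produced the same value.
bool initial_values_agree() {
    bool same = direct.load() == 2 && braced.load() == 2;
#ifdef ATOMIC_VAR_INIT
    same = same && via_macro.load() == 2;
#endif
    return same;
}
```

All of the forms end up with an atomic holding 2; the sketch does not show any behavioural difference between them.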

3
General Discussion about just::thread / Re: unaligned atomic variables
« on: February 17, 2011, 03:19:56 PM »
Another thing that can be done is putting atomic classes between
Code:
#pragma pack(push)
#pragma pack(4)
// class atomic ...
#pragma pack(pop)


Quote
That wouldn't help: it just affects the layout of the atomic objects, not the alignment of the whole object.
Yes, that's right. I didn't think about that. Thank you.

4
General Discussion about just::thread / Re: unaligned atomic variables
« on: February 17, 2011, 03:19:11 PM »
Quote
MSVC is inconsistent: with #pragma pack(1) it obeys the alignment requirements sometimes, but not others.

gcc always ignores alignment requirements with #pragma pack(1).

I'll put asserts in the atomic ops.
Thanks.

5
General Discussion about just::thread / Re: unaligned atomic variables
« on: February 17, 2011, 03:12:46 PM »
Another thing that can be done is putting atomic classes between
Code:
#pragma pack(push)
#pragma pack(4)
// class atomic ...
#pragma pack(pop)

6
General Discussion about just::thread / Re: unaligned atomic variables
« on: February 17, 2011, 02:59:07 PM »
Yes, I guess you have a base class where the actual data is stored. The constructor might do that check. Alternatively you can do something like this
Code:
template <typename T>
class atomic_base {
   // T data;
   __declspec (align(4)) T data;
};

Quote
The code already does that, so everything will be correctly aligned by default. However, if you use #pragma pack(1) then you're deliberately asking the compiler to ignore such alignment specifications, so it doesn't help.
Hmm...
#pragma pack affects default alignment, but it shouldn't affect explicit alignment.
This code outputs 8 for the size and true for the is_aligned call on my machine in spite of #pragma pack(1).
Code:
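#include <cstdint>   // std::uintptr_t
#include <iostream>

#pragma pack(1)

struct X
{
    char a;
    __declspec (align(4)) long b;   // MSVC-specific explicit alignment
};

// Returns true if ptr is a multiple of boundary (boundary must be a power of two).
bool is_aligned(void *ptr, int boundary)
{
    return ((std::uintptr_t)ptr & (boundary - 1)) == 0;
}

int main()
{
    X x;
    std::cout << sizeof(X) << std::endl;
    std::cout << is_aligned(&x.b, 4) << std::endl;
}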
Using this structure as a member of another unaligned structure will still keep our variable b properly aligned.
However, I'm not sure about gcc.

If this is not portable enough, then an assert is still a good solution too.
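To check this on another compiler, the same experiment can be written with the standard `alignas` specifier instead of `__declspec(align)`. This is only a sketch: the struct name and the function `explicit_alignment_survives_packing` are made up, and the result depends on the compiler, so no particular output is claimed.

```cpp
#include <cstddef>
#include <cstdint>

// True if ptr is a multiple of boundary; boundary must be a power of two.
inline bool is_aligned(const void* ptr, std::size_t boundary) {
    return (reinterpret_cast<std::uintptr_t>(ptr) & (boundary - 1)) == 0;
}

#pragma pack(push, 1)
struct PackedWithAlignas {      // hypothetical name
    char a;
    alignas(long) long b;       // standard spelling of an explicit alignment request
};
#pragma pack(pop)

// Reports whether this compiler kept b aligned despite the packing pragma.
// The answer is compiler-specific; treat it as a probe, not a guarantee.
inline bool explicit_alignment_survives_packing() {
    PackedWithAlignas x{};
    return is_aligned(&x.b, alignof(long));
}
```

If the probe returns false on a given compiler, explicit alignment alone is not enough there and a runtime check is the safer option.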

7
General Discussion about just::thread / Re: unaligned atomic variables
« on: February 17, 2011, 02:46:39 PM »
Yes. Alternatively you can do something like this
Code:
template <typename T>
class atomic_base {
   // T data;
   __declspec (align(4)) T data;
};

8
General Discussion about just::thread / unaligned atomic variables
« on: February 16, 2011, 08:55:15 AM »
Hi Anthony,

It might be useful to have an assertion for unaligned atomic variables. For example:
Code:
#include <atomic>

#pragma pack(1)

struct X
{
    std::atomic_char a;
    std::atomic_long b;
};
Plain loads and stores are guaranteed to be atomic on x86 only when the access is aligned, so this structure may cause problems. In the example above, atomic_long is not properly aligned (and I guess neither is its internal representation), so I think it would be a good idea to notify the user (at least in debug mode) that the library is not going to work as expected. Alternatively, you could explicitly align the internal variable to the required boundary (I think this is the better solution).
What do you think?
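Something along these lines is what I mean by a debug-mode check. It is only a sketch written against std::atomic from the outside, not the library's real internals; `is_naturally_aligned` and `checked_store` are names I made up.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if the atomic object sits on an address
// that is a multiple of its own alignment requirement.
template <typename T>
bool is_naturally_aligned(const std::atomic<T>& obj) {
    auto addr = reinterpret_cast<std::uintptr_t>(&obj);
    return addr % alignof(std::atomic<T>) == 0;
}

// Hypothetical wrapper: asserts alignment in debug builds before storing.
// With NDEBUG defined the assert compiles away and only the store remains.
template <typename T>
void checked_store(std::atomic<T>& obj, T value) {
    assert(is_naturally_aligned(obj) && "atomic object is misaligned");
    obj.store(value);
}
```

For a normally declared object, such as `std::atomic<long> a(0); checked_store(a, 5L);`, the assertion passes; it is meant to fire only for objects placed in packed structures.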

Here is an example that produces a data race with atomic variables. In the loop I'm trying to make one variable straddle two cache lines. I changed the alignment explicitly, but it could also be changed through compiler settings and be less obvious (although even problems caused by an explicit change might not be obvious to people who are unaware of the memory alignment requirement for atomic operations).

Code:
#include <atomic>
#include <cassert>      // assert
#include <functional>   // std::ref
#include <iostream>
#include <thread>

// Repeatedly stores 0 and checks that a load never observes a torn value.
void thread1(std::atomic_long& x)
{
    for (int i = 0; i < 1000000; ++i)
    {
        x.store(0);
        long value = x.load();
        assert(value == 0 || value == ~0);
    }
}

// Repeatedly stores all-ones and performs the same check.
void thread2(std::atomic_long& x)
{
    for (int i = 0; i < 1000000; ++i)
    {
        x.store(~0);
        long value = x.load();
        assert(value == 0 || value == ~0);
    }
}

#pragma pack(push)
#pragma pack(1)
struct X
{
    char alignment;        // shifts x off its natural boundary
    std::atomic_long x;
};
#pragma pack(pop)

int main()
{
    // With packing, some array element should place x across a cache line.
    X arr[100];
    for (int i = 0; i < 100; ++i)
    {
        std::cout << i << std::endl;
        arr[i].x = 0;
        std::thread thread(thread1, std::ref(arr[i].x));
        thread2(arr[i].x);
        thread.join();
    }
}

Thanks.

9
Hi Anthony,

Thanks for the new release.
Just want to note that there is no std::copy_exception function in N3225 anymore; it has been replaced with std::make_exception_ptr (18.8.5, page 470).
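For reference, a short sketch of how the renamed function is used. The helper names `message_from` and `demo` and the error text are made up for illustration.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Rethrows the stored exception and extracts its message.
std::string message_from(std::exception_ptr p) {
    try {
        std::rethrow_exception(p);
    } catch (const std::exception& e) {
        return e.what();
    } catch (...) {
        return "unknown exception";
    }
}

// Creates an exception_ptr without throwing, then inspects it.
std::string demo() {
    std::exception_ptr p =
        std::make_exception_ptr(std::runtime_error("worker failed"));
    return message_from(p);
}
```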

If I see anything else I'll post it here.

Thanks.

10
General Discussion about just::thread / Re: Thread Local Storage
« on: February 03, 2011, 03:06:58 PM »
Thanks.

11
General Discussion about just::thread / Re: Thread Local Storage
« on: February 03, 2011, 02:53:11 PM »
Quote
Maybe. boost::thread_specific_ptr is slightly different again: though each pointer is freed automatically, you have to manually allocate and construct the object for each thread.
It is, but at least it allows you to have thread-local objects without worrying about freeing them afterwards, which is simply impossible with __declspec(thread).
As far as I know, boost::thread_specific_ptr is handled in a special way in the boost::thread calling procedure, so I can't use boost::thread_specific_ptr with std::thread. Is that right?

12
General Discussion about just::thread / Re: Thread Local Storage
« on: February 03, 2011, 02:36:18 PM »
Quote
There is currently no support for TLS in just::thread. Given that the compilers supported by just::thread do have TLS support, I will consider adding it for a future release.
Thanks for your answer Anthony.

Currently I'm using the compilers' own support for TLS, but both GCC and MSVC have the same issue: they don't allow objects with a non-trivial constructor in thread-local storage. As far as I understood from the C++ Concurrency in Action examples, this is allowed in C++0x. Maybe something like boost::thread_specific_ptr would be more useful here?
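This is roughly what I would like to be able to write, using the standard thread_local keyword on a compiler that implements it. It is a sketch only; `Logger`, `local_logger` and `log_line` are hypothetical names.

```cpp
#include <string>

// Hypothetical type with a non-trivial constructor and destructor
// (std::string has both).
struct Logger {
    std::string prefix;
    int lines;
    Logger() : prefix("log:"), lines(0) {}
};

// Each thread gets its own Logger, constructed on that thread's first call
// and destroyed automatically when the thread exits.
Logger& local_logger() {
    thread_local Logger instance;
    return instance;
}

// Increments and returns the calling thread's private line counter.
int log_line() {
    return ++local_logger().lines;
}
```

Calling log_line() twice on one thread yields 1 and then 2; another thread would start again from 1 because it has its own instance.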

13
General Discussion about just::thread / Thread Local Storage
« on: February 03, 2011, 02:20:45 PM »
Hello Anthony,

Thread Local Storage has been added to the C++ standard as a keyword, and it's not possible to provide it in a library without special support from the compiler. However, it would be useful to have a portable TLS in just::thread. I couldn't find anything like that in the documentation or the forum, so could you please tell me whether there is TLS support in just::thread?

Thanks.

14
Thanks a lot!

15
Thanks for your answer Anthony.

There is still one thing I don't get. To avoid asking the wrong question based on a possible misunderstanding, let me first state what I understand; please correct me if I'm wrong.

We have 4 types of fences in C++1x. They are:

memory_order_release - which is no-op on x86
Prevents compiler and hardware from reordering stores, i.e. any store that is done before this should be completed before the fence.
Store to variable with memory_order_release flag means
atomic_thread_fence(memory_order_release);
store();


memory_order_acquire - which is no-op on x86
Prevents compiler and hardware from reordering loads, i.e. any load that is done after this should happen after the fence.
Load from variable with memory_order_acquire flag means
load();
atomic_thread_fence(memory_order_acquire);


memory_order_acq_rel - which is again no-op on x86
Prevents compiler and hardware from reordering loads with load part of operation and stores with store part of operation.
Operation with memory_order_acq_rel flag means
atomic_thread_fence(memory_order_release);
operation();
atomic_thread_fence(memory_order_acquire);


memory_order_seq_cst - which is mfence on x86
Full memory fence, prevents compiler from reordering both loads and stores.
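To make the release and acquire items concrete, here is a minimal sketch of how I understand the pairing: a release store publishes earlier writes, and an acquire load that sees the flag also sees those writes. The names `payload`, `ready`, `publish` and `try_consume` are made up for illustration.

```cpp
#include <atomic>

int payload = 0;                    // plain, non-atomic data
std::atomic<bool> ready(false);     // flag that publishes the data

// Writer side: the release store keeps the write to payload
// from being reordered after the flag becomes visible.
void publish(int value) {
    payload = value;
    ready.store(true, std::memory_order_release);
}

// Reader side: if the acquire load sees true, the read of payload
// is guaranteed to see the value written before the release store.
bool try_consume(int& out) {
    if (!ready.load(std::memory_order_acquire)) {
        return false;
    }
    out = payload;
    return true;
}
```

One thread would call publish(42) while another polls try_consume(out); once it returns true, out holds 42.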

Now, on page 18 of the above-mentioned article, Bartosz says that Peterson's lock is an example of an algorithm that will not work without fences. However, Peterson's lock uses only acquire, release and acq-rel.
http://www.justsoftwaresolutions.co.uk/threading/petersons_lock_with_C++0x_atomics.html
If those operations are no-ops on x86, then Peterson's lock has to work on x86 without any fence. Am I wrong somewhere?
For Dekker's algorithm it's clear: it uses a sequentially consistent memory fence, so it requires fencing on x86 too. But what about Peterson's lock?

Quote
though such a fence can also be achieved with a LOCKed RMW instruction, such as XCHG.
In Visual Studio, the MemoryBarrier() function is implemented as a non-LOCKed RMW instruction.
http://msdn.microsoft.com/en-us/library/ms684208(v=vs.85).aspx
Quote
LONG Barrier;
__asm {
     xchg Barrier, eax
}
Is it a typo in MSDN?

Quote
Plain loads and stores on x86 give you acquire (for loads) and release (for stores) ordering. This is sufficient for many algorithms, but not for others. LFENCE and SFENCE are primarily of use with non-temporal stores such as MOVNTI, which don't obey the normal cache coherency rules (that's what "non-temporal" means, in effect --- they don't occur at a particular time).
So does that mean there are no analogs of sfence and lfence in C++1x?

Thanks a lot.
